PHYSICS FOR 
MATHEMATICIANS 


MECHANICS I 


MICHAEL SPIVAK 


PUBLISH OR PERISH, INC. 


www.mathpop.com 


Copyright © 2010 by Michael Spivak 
All Rights Reserved 


ISBN 0-914098-32-2 
EAN 978-0-914098-32-4 


Printed in the United States of America 


for my 
Aunt Frieda 


PREFACE 


The purpose of this book, or possibly series of books, is indicated precisely
by the title Physics for Mathematicians. It is only necessary for me to explain 
what I mean by a mathematician, and what I mean by physics. 

By a mathematician I mean some one who has been trained in modern mathematics
and been inculcated with its general outlook. No specific mathematical
knowledge is expected, but for the purposes of this book on mechanics the
material in A Comprehensive Introduction to Differential Geometry, Volumes 1 and 2,
will generally be regarded as a prerequisite, not simply because I wrote this book,
but because many of the concepts of mechanics are, in fact, best expressed in
terms of basic differential geometric concepts. This will always be referred to
as DG, rather than Spivak [2], which is how it appears in the bibliography.

And by physics I mean ... well, physics, what physicists mean by physics,
i.e., the actual study of physical objects, even wheels, weights, ropes and
pulleys (rather than the study of symplectic structures on cotangent bundles, for
example). In addition to presenting the advanced physics, which mathematicians
find so easy, I also want to explore the workings of elementary physics,
and the mysterious maneuvers—which physicists seem to find so natural—by
which one reduces a complicated physical problem to a simple mathematical
question, which I have always found so hard to fathom.

As these remarks probably reveal, basically I have written this work in order 
to learn the subject myself, in a form that I find comprehensible. And readers 
familiar with some of my previous books probably realize that this has pretty 
much been the reason for those works also. I have been fortunate in being 
able to make a livelihood of sorts in this way, by indulging my desire to learn 
things in my own peculiar fashion while providing others with an account of the 
adventure. Perhaps this travelogue of an innocent abroad in a very different field 
will also turn out to be a book that mathematicians will enjoy (though physicists 
probably will not).

I am greatly indebted to many people and institutions for their help with
this project. Richard Palais was, as always, an ever helpful and enthusiastic
supporter of the project. Besides his help with mathematical questions, some
discussions with him helped me enormously in understanding and formulating
certain basic principles, though I hasten to add that he is not responsible for
any heretical ideas that might appear here. Ted Shifrin likewise provided
unstinting help, as well as probing questions. Larry Jackal gently steered me away
from some stupid mistakes and over-simplifications, and Mitch Baker heroically
undertook a thorough examination of the first draft of Part I, resulting in many
corrections and improvements. John Milnor greatly contributed to my understanding
of one vexing topic, and among other helpful people whom I have
pestered I should mention Eisso Atzema, Robert Bryant, James Casey, Carmen
Chicone, Poul Hjorth, Yildirim Hurmuzlu, Hermann Karcher, Tom Lehrer (as
I couldn’t plagiarize), David Nadler, Anders Persson, John Polking, and Olivier
Thill.

I am grateful to the mathematics department at Rice University for affording 
me privileges allowing me to use the Rice University Library, and to the helpful 
librarians there, in particular, Erin McAfee and the science reference librarian 
Debra Kolah. Through the efforts of Martin Guest of Tokyo Metropolitan
University and Yoshiaki Maeda of Keio University, I was able to give a series 
of lectures at Keio University on the material of Part I, the first time I had the 
opportunity to present some of this material to a live audience. A written version 
of the lectures was made available on the web, which providentially allowed me 
to be contacted by a fellow mathematician, Bruce Pourciau, who had studied 
many of the same questions that I puzzled through concerning Newton’s work; 
many of his papers, listed in the Bibliography, provide additional details for the 
discussions of Chapters 1 and 2.





[Figure: crystal ball icon labeled "Looking ahead"]


Like every self-indulgent author, I like to think that you will try to solve
all the problems ... or at least glance at them! Some problems will be used 
or referred to later on in the text, or sometimes in a future problem, and this 
crystal ball will alert you to look at them. The number in the crystal ball is the 
page where the problem is first used or mentioned. For example, page 41 is the 
first problem with a crystal ball. 

I should also point out that the problems are provided mainly to help in 
understanding basic points, or to mention additional topics, rather than to pro- 
vide proficiency in solving physics problems, and their number decreases rather 
rapidly after Part I.


Michael Spivak 
puborperish@gmail.com 


CONTENTS

Preface

PART I. THE FOUNDATIONS OF MECHANICS

Prologue

Chapter 1. Newtonian Mechanics
    Mass and force
    The first law
    The second law
    Mass and weight are different ...
    [illegible]
    The third law
    The lures of symmetry
    Composition of forces
    Addendum 1A. It Isn’t Rocket Science
        (Why Easy Physics is So Hard: I)
    Addendum 1B. Weight Versus Mass
    Problems

Chapter 2. Newton’s Analysis of Central Forces
    Problems

Chapter 3. Conservation Laws
    Conservation of momentum
    Conservation of angular momentum
    Conservation of energy: kinetic and potential energy
    Conservation of energy in collisions
    Conservation of energy in general
    Addendum 3A. Whips and Chains
        (Why Easy Physics is So Hard: II)
    Addendum 3B. Follow the Bouncing Ball
        (Why Easy Physics is So Hard: III)
    Problems

Chapter 4. The One-Body and Two-Body Problems
    The one-body problem
    “The motion of bodies in mobile orbits,
        and the motion of the apsides”
    The two-body problem


    “The attractive forces of spherical bodies”
    Addendum 4A. À la Principia
    Addendum 4B. Reduction to a One-Dimensional Problem
    Addendum 4C. Rutherford Scattering
    Addendum 4D. Bertrand’s Theorem
    Addendum 4E. Power Force Laws and Duality
    Problems

Chapter 5. Rigid Bodies
    Equilibrium
    Virtual infinitesimal displacements
    Configuration space
    The principle of virtual work
    d’Alembert’s principle
    The inertia tensor
    Calculating the inertia tensor
    Rotation about an axis
    Kinetic energy
    Continuous bodies
    Elementary examples
    Addendum 5A. The Strong Form of the Third Law
    Problems

Chapter 6. Constraints
    Rigid bodies in contact
    The pendulum
    The compound (physical) pendulum
    Equilibrium and Stability
    [illegible]
    [illegible]
    Some subsidiary topics (time-dependent constraints and hinges)
    Holonomic and differential constraints
    Finding the constraint forces
    [illegible]
    Give a physics student enough rope problems...
    Addendum 6A. The Bouncing SuperBall
    Addendum 6B. Statically Indeterminate Problems
    Problems

Chapter 7. Philosophical and Historical Questions
    Early notions of conservation of momentum
    Huygens and Galilean Invariance


    Newton’s proof of the third law
    The parallelogram law
    Newton at the hands of the scholars

PART II. BUILDING ON THE FOUNDATIONS

Chapter 8. Oscillations
    Huygens’ cycloidal pendulum
    The spherical pendulum
    Springs
    Harmonic oscillations
    Damped oscillations
    Forced oscillations
    Damped forced oscillations
    Coupled oscillators
    The double pendulum
    The vibrating string
    Addendum 8A. Abel’s Integral Equation
    Addendum 8B. Envelopes
    Addendum 8C. Stability of Solutions of Differential Equations
    Problems

Chapter 9. Rigid Body Motion
    Rotating coordinate systems
    The Euler equations
    Poinsot’s geometric description
    The free symmetric top, in body coordinates
    The free symmetric top, in inertial coordinates
    Euler angles
    The heavy symmetrical top
    The cuspidal case; fast tops
    Precessing tops
    Sleeping tops
    [illegible]
    The polar cuspidal top
    Gyroscopes
    The gyrocompass
    Precession of the equinoxes
    Addendum 9A. The Euler Equations for Rotating Principal Vectors
        the Rolling Disc
    Addendum 9B. Secrets of the Herpolhode
    Problems


Chapter 10. Non-Inertial Systems and Fictitious Forces
    The basic equations
    The translational or acceleration force
    The centrifugal force
    The deflection of a hanging body
    The azimuthal or Euler force
    The Coriolis force
    The deflection of a falling body
    The southward deflection
    Stupid experimenter tricks
    Foucault’s pendulum
    Hurricanes and bath-tubs
    Mach’s Principle
    Addendum 10A. The Trojan Asteroids
    The restricted three-body problem
    [illegible]
    Stability calculations
    The collinear Lagrange points
    Addendum 10B. The Southward Deflection
    Problems

Chapter 11. Friction, Friend and Foe
    The laws of friction
    The Painlevé paradoxes
    The noble game of billiards
    The Jellett invariant
    Tippe Tops and hard boiled eggs
    Problems

PART III. LAGRANGIAN MECHANICS

Chapter 12. Analytical Mechanics
    The mathematical arena for analytical mechanics
    Specialized considerations for analytical mechanics
    Lagrange’s equations
    Using Lagrange’s equations
    Constraint problems
    Conservation of energy; action
    Time-dependent Lagrangians
    Lagrange multipliers
    Addendum 12A. Lagrange’s Rolling Disc
    Problems


Chapter 13. Variational Principles
    The Euler equations
    Hamilton’s principle
    Maupertuis and the Principle of Least Action
    Jacobi’s form of the principle of least action
    Noether’s theorem
    The lures of symmetry, advanced version
    Addendum 13A. Lagrange Multipliers for Conditional Critical Points
    Problems

Chapter 14. Small Oscillations
    Problems

INTERLUDE

Chapter 15. Light
    Optics in antiquity
    Islamic scholars
    Kepler and Galileo
    Descartes
    [illegible]
    Huygens
    Newton
    Maupertuis
    [illegible]
    Addendum 15A. Battling to a Draw
    Addendum 15B. Huygens’ Principle

PART IV. HAMILTONIAN MECHANICS
    From Aragonite to the Schrödinger Wave Equation

Chapter 16. The Cotangent Bundle
    Special features of the cotangent bundle
    The Legendre transform
    Addendum 16A. The Clairaut Equation
    Problems

Chapter 17. The Interplay of Mechanics and Optics
    Optics emulates mechanics
    Malus’ theorem
    Fermat’s Principle and Huygens’ Construction
    Conical Refraction in Aragonite
    Mechanics returns the compliment


    The equations on T*M
    The partial derivatives of S
    A partial differential equation for S
    Invariant definitions; the interplay of TM and T*M
    The extended Hamilton’s principle
    Addendum 17A. Liouville’s Volume Theorem
    Problems

Chapter 18. Hamilton-Jacobi Theory
    The complete integral
    (Optional) Envelopes of solutions
    (Optional) Inverting the process; contact curves
    Jacobi’s theorem
    Jacobi’s theorem and mechanics
    Hamilton’s characteristic function
    HAMILTON-JACOBI THEORY AND
        THE SCHRÖDINGER WAVE EQUATION
    Addendum 18A. Motion in the Field of Two Fixed Masses
        Geodesics on Ellipsoids
    Addendum 18B. Huygens’ Construction for Hyperbolic Equations
    Problems

Chapter 19. Canonical Transformations
    Canonical transformations
    Hamiltonian flows and integral invariants
    Hamiltonian flows and canonical transformations
    Generating functions
    Time-dependent canonical transformations
    Using generating functions to simplify Hamilton’s equations
    Generating functions in the time-independent case
    Other types of generating functions
    Addendum 19A. Time-(In)Dependent Hamiltonians
    Addendum 19B. Generalized Canonical Transformations
    Problems

Chapter 20. Symplectic Manifolds
    Symplectic vector spaces
    Isotropic subspaces
    Symplectic manifolds
    Poisson brackets
    Poisson brackets [remainder illegible]
    Problems


Chapter 21. Liouville Integrability
    Functions in involution
    Conditional periodicity and the invariant tori
    Action-angle variables
    Action-angle variables on symplectic manifolds
    Background
    Problems

Chapter 22. Epilogue
    Adiabatic invariants
    The averaging principle
    An averaging theorem for one-dimensional systems
    Adiabatic invariance of J
    The Hannay angle
    The Hannay hoop
    Foucault’s pendulum revisited
    Problems

Supplement. A PDE Primer

Bibliography
    Unabbreviated Journal Titles


PART I 


THE FOUNDATIONS 
OF MECHANICS 


PROLOGUE 


πᾶ βῶ καὶ κινῶ τὴν γῆν
a place to stand on! and I will move the Earth


— Archimedes 


Archimedes’ statement about the lever has come down to us in several forms,
of which only the one appearing here is customarily translated with an 
exclamation point. This punctuation might even seem unnecessarily dramatic 
for modern tastes, because the lever no longer incites much wonder in us, so 
familiar are we with its principles. Yet who can forget the amazement of a child 
balancing an adult on a see-saw, simply by being placed at the right position. 
How could this be? Where did all that extra force come from!? 

The only wonder nowadays is that a physics student is unlikely to produce a 
satisfactory answer to this question. Perhaps we will be offered a few mumblings 
about moments, force times distance, laws of the lever ... perhaps even the 
“principle of virtual work”. But we probably won’t get an answer that seems 
to explain where that extra force comes from; and it is highly unlikely that we 
will get an answer that begins by establishing principles about rigid bodies, even 
though the rigidity of the lever is an absolute necessity for it to work.

In fact, the whole path from Newton’s Laws, which basically concern “point
masses”, to bodies whose shape and extent are significant, is often rather
dubiously traversed, even though elementary physics courses blithely pose such
problems of the most diverse sorts.

Our progress in explaining the lever can be used as a measure of how far we 
have managed to bridge this divide, an appealing test of whether our description 
of modern mechanics manages to account for what is generally considered one 
of the first mechanical principles ever discovered.


Archimedes, of course, didn’t simply enunciate the law of the lever, but as a
true mathematical theoretician, he devised a proof. The crux of Archimedes’
proof can be illustrated by the particular case where we have two objects,
weighing 3 units and 2 units, respectively, positioned at points A and B whose
distances from the fulcrum O are in the ratio 2 to 3, so that some unit distance
fits twice into the segment AO and three times into the segment OB. To create
a symmetrical situation, we begin by laying off 3 additional units, or the length
of OB, to the left of A, and 2 additional units, or the length of AO, to the right
of B, for a total of 10 units.





Then we take a weight of ½ and place one copy at the center of each of our
10 units. This arrangement is clearly balanced, because it is completely
symmetrical to the left and right of O. On the other hand, the 6 left-most weights
have a total weight of 3, and their center of gravity is exactly
at A, so they have the same effect as our original weight of 3 units at A.
Similarly, the other 4 weights have a total weight of 2, and their center of gravity is
exactly at B, so they have the same effect as our original weight of 2 units at B.
And thus it follows that the original weights at A and B will balance!
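
The bookkeeping in this particular case is easy to check mechanically. The following is only a sketch under my own conventions, none of which come from Archimedes: positions are measured in unit lengths with the fulcrum O at 0, so that A sits at -2 and B at 3, and the names `centers`, `total_weight` and `center_of_gravity` are invented for the illustration.

```python
from fractions import Fraction

# Assumed coordinates: fulcrum O at 0, A at -2, B at 3 (unit lengths).
A, B = Fraction(-2), Fraction(3)
HALF = Fraction(1, 2)

# One weight of 1/2 at the center of each unit segment of [-5, 5].
centers = [Fraction(-5) + HALF + k for k in range(10)]

def total_weight(positions):
    """Total weight of a group of half-unit weights."""
    return HALF * len(positions)

def center_of_gravity(positions):
    """All weights are equal, so this is just the mean position."""
    return sum(positions) / len(positions)

left, right = centers[:6], centers[6:]

# The full arrangement is symmetric about the fulcrum.
print("all ten:", center_of_gravity(centers))
# The six left-most weights stand in for the weight of 3 at A.
print("left six:", total_weight(left), "at", center_of_gravity(left))
# The remaining four stand in for the weight of 2 at B.
print("right four:", total_weight(right), "at", center_of_gravity(right))
```

Exact fractions are used so that the comparisons with A and B are equalities rather than floating-point approximations. Of course, the code takes for granted exactly the step that is at issue, namely that a group of weights may be replaced by their total placed at the center of gravity.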


Archimedes doesn’t bother to illustrate the reasoning with a particular case, 
but simply provides the argument for the general case of commensurate dis- 
tances and weights, which doesn’t require any more generality, but does make 
the argument quite a bit harder to follow. 

Moreover, Archimedes then uses the “method of exhaustion” (the Greek geo- 
metric way of using the density of the rationals) to extend the result to the case 
of incommensurable distances. This might seem like over-kill for a proposition
about physical weights and distances, which can only be measured to within a 
certain accuracy, but then Archimedes was really a mathematician! 




Archimedes’ proof has been criticized, and emendations to it have been
proffered, for centuries, culminating in a detailed examination in Mach [1], a book
that has been enormously influential not only among historians and philoso- 
phers of science, but even among scientists themselves. 

Mach’s critique of Archimedes’ analysis of the lever dismisses the whole en- 
terprise as a symptom of the “Grecian mania for demonstration”. He presents 
two axioms of Archimedes, which may be translated as follows: 


1) Equal weights suspended at equal distances from a fulcrum are in equilib- 
rium. 


2) Equal weights suspended at unequal distances cannot be in equilibrium. 
The lever will be inclined towards the weight at the greater distance. 


Mach points out that at best these axioms say that the effect of a weight W at
distance L from the fulcrum has a dependence of the form W · f(L) [for a
monotonic f, though Mach doesn’t mention that]. How could the particular
form W · L possibly be inferred from this? Instead, the replacement of one
weight by several smaller weights implicitly assumes that the relationship is of
this form.
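
Mach's point can be made concrete with a small experiment. The sketch below is my own illustration, not anything in Mach: it compares the lever law f(L) = L with a hypothetical rival f(L) = L², chosen only because it is also monotonic. Both satisfy the two axioms, but only the linear law allows a weight of 2 at distance 3 to be replaced by weights of 1 at distances 2 and 4.

```python
def effect(weights, f):
    """Turning effect of (weight, distance) pairs on one side, assuming W * f(L)."""
    return sum(w * f(d) for w, d in weights)

def linear(L):
    # The law of the lever.
    return L

def quadratic(L):
    # Hypothetical rival law; monotonic for L >= 0, but not Archimedes' law.
    return L * L

single = [(2, 3)]          # one weight of 2 at distance 3
split = [(1, 2), (1, 4)]   # two weights of 1, placed symmetrically about distance 3

for name, f in [("linear", linear), ("quadratic", quadratic)]:
    # Axiom 1: equal weights at equal distances have equal effects.
    axiom1 = effect([(1, 3)], f) == effect([(1, 3)], f)
    # Axiom 2: of two equal weights, the one at the greater distance prevails.
    axiom2 = effect([(1, 4)], f) > effect([(1, 3)], f)
    # Archimedes' replacement step: does splitting the weight leave the effect unchanged?
    replacement_ok = effect(single, f) == effect(split, f)
    print(name, axiom1, axiom2, replacement_ok)
```

For the quadratic law the split weights have effect 4 + 16 = 20 rather than 2 · 9 = 18, so the replacement step is not a consequence of the two axioms alone.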


Despite the apparently irrefutable logic of this assault, it must be said that 
Archimedes’ argument certainly does seem awfully clever, and at first blush it 
even seems awfully convincing! So we should mention that Mach conveniently 
pretends that Archimedes proceeds from only two axioms, whereas Archimedes 
actually states eight, some of which specifically mention center of gravity, a 
concept which was apparently analyzed in an earlier lost work of Archimedes. 
In fact, Archimedes’ proof of the law of the lever forms the first part of a work 
entitled On the equilibrium of planes or centers of gravity of plane figures, in which 
Archimedes finds the centers of gravity of triangles, rectangles, trapezoids, and 
finally, in a bravura calculation, the center of gravity of a section of a parabola. 

The details of Archimedes’ proof of the law of the lever may be found in
Dugas [1; pp. 25-27]. It is clear from Archimedes’ arguments that he regards
the center of gravity of an extended body of weight W as a point where a
“point mass” of weight W would have the same effect as the extended body.
Moreover, the center of gravity of two equal bodies is supposed to be midway 
between them (presumably this is meant to apply only in the case where the two 




bodies are connected, say by a rigid rod of negligible weight). So, for example, 
Archimedes would claim that the following weights are balanced because the 




two bodies of weight 1 act the same as a single body of weight 2 situated midway 
between them. One might object that in this case the two smaller weights 





aren't connected by a rigid rod, but of course we are assuming that the weights 
don’t slide along the lever, so that it might be better to think of the weights 
as attached to the lever. Notice that Archimedes’ evaluation of the situation 





actually amounts to some sort of assertion about rigid bodies—something about 
how the rod combines the effect of the two end weights. 

I wouldn’t want to attempt to defend Archimedes’ proof too earnestly,¹ since
our aim, after all, is to show in the next few chapters how the law of the lever 
arises as part of a systematic investigation of rigid bodies. But Archimedes’ 
analysis serves as a good starting point from which we can jump ahead a couple 
of millennia, to Newton. 


¹ For an extended discussion of Archimedes’ proof along similar lines, together with
further references, see Dijksterhuis [1]. Also see van der Waerden [1] for a critique of 
Mach’s critique. 


CHAPTER 1 


NEWTONIAN 
MECHANICS 


The terms classical mechanics and Newtonian mechanics are used virtually
synonymously, attesting to the fact that all of classical mechanics flows from
Newton [2], Philosophiæ Naturalis Principia Mathematica (Mathematical Principles
of Natural Philosophy), or simply The Principia.


It should be said at the outset that if you are trying to learn mechanics, the 
Principia is not the place to start! It has all the inherent difficulty and obscurity 
of any classic, with numerous approaches to fundamental considerations that 
are nowadays treated in quite different ways. But there is still considerable 
interest in examining parts of the Principia—after all, “Newton’s Laws”, with 
which all discussions of classical mechanics begin, come from this source, and 
there are some pleasant mathematical surprises awaiting us as well. 

All our quotations from the Principia are taken from 


Newton, The Principia. Translated by I. Bernard Cohen and Anne Whitman,
University of California Press, 1999. 


This recent translation eliminates most obsolete terms, carefully explains any 
remaining old-fashioned terminology, and corrects various errors in the older 
translations. Moreover, it begins with an extensive Guide to reading the Prin- 
cipia that is almost as long as the Principia itself. 

We will essentially cover a nowhere dense subset of the Principia, and another 
recent book, on which I relied for many topics, will be extremely useful to any 
modern reader interested in exploring the enormous amount of material of 
diverse sorts that it contains: 


S. Chandrasekhar [2] Newton’s Principia for the Common Reader, Clarendon 
Press, 1995. 


This book provides detailed modern mathematical arguments (though not al- 
ways ones that correspond directly to Newton’s proofs) for a large portion of the 
work, together with many intriguing conjectures about the course of Newton’s 
thought. 




In the manner of a preëmptive strike, I would like to say that the difficulty
of reading the Principia is equaled only by the danger of commenting upon
it. The work has been mined so thoroughly by experts and scholars, and its
elaborate historical roots have been examined so carefully, that any amateur 
venture is sure to produce some remark or interpretation that will be met with 
scorn by the cognoscenti. 

Nevertheless, we will be happily unconcerned while extracting and interpret- 
ing parts of the Principia in terms of modern physics. Several historical remarks 
about the Principia may be found sprinkled throughout Part I. 


In the manner of Euclid’s Elements, which it emulates, the Principia begins
with two preliminary sections, “Definitions” and “Axioms, or the Laws
of Motion”. And it would be fair to say that it shares the same virtues and
defects as the Elements, the latter a work that exhibits admirable rigor only after
a shaky beginning where we have to contend with definitions like a point as
“that which has no part” and a line as “breadthless length”, while basic
principles such as the “side-angle-side theorem” are presented as theorems with
indefensible proofs, rather than as additional postulates.


Reworking the Elements into a rigorous system hardly troubles modern math- 
ematicians. We simply give a list of undefined terms like “point”, “straight line”, 
and the notion of a “point lying on a line”, declare any basic results that we 
can’t prove adequately as axioms, and carry on. Of course, we might, for ex- 
ample, instead prefer to declare the straight lines to be particular collections 
of points, so that the notion of lying on a line reduces to set membership; this 
just illustrates that there will always be several possible ways of dealing with any 
collection of “undefined terms” that are supposed to have connections between
them.


For modern physicists, the role of “undefined” terms may be replaced by the 
idea of “operationally defined” terms: a concept is defined once we explain 
how to measure it. However, as we shall see, even this idea can present some 
difficulties and subtleties. 


Basic Concepts 


Mass and force. In mechanics, the first basic concept is that of the mass of 
an object. Newton initially calls this the “quantity of matter” in the Principia, 
which immediately commences with Definition 1: 


Quantity of matter is a measure of matter that arises from its density and
volume jointly. 




Great acumen is hardly needed to realize that this definition is hopelessly 
circular, since density is normally defined as the ratio of mass to volume, but 
Newton’s unhelpful phrase does have some implicit implications. For example, 
we expect the mass of an object to remain unchanged if we change its shape, 
that the mass of a quantity of water remains the same after it has been frozen 
into a piece of ice, or even reconstituted as powdery snow, and the mass of a 
quantity of air or other gas remains the same if we confine it to a smaller region. 

But our modern conception of mass is better reflected in what Newton soon 
afterwards refers to as the “inertia of the mass”: a body’s resistance to being 
moved if it is at rest—or of having its velocity changed if it is already moving 
with a uniform velocity. 

For example, consider two solid balls, say about 3 feet in diameter, one made 
of cork, the other of iron, resting on a smooth floor. In order to get the cork 
sphere rolling at a speed of about 5 feet per second, we would have to push 
on it a bit, or, in everyday terms, exert a small force on it. But in order to get 
the iron sphere rolling at that speed we would have to push much harder, or, in 
everyday terms, exert a much larger force on it. It is these different experiences
that lead us to regard the iron ball as having much greater mass than the cork 
ball. 

Thus, roughly speaking, the mass of an object is measured by its resistance 
to being moved by a force, and of course force is the other basic concept of 
mechanics, and we simply have to assume that we have some idea of what we 
mean when we say that a magnet exerts a force on a piece of iron, or that 
gravity exerts a force on any object, or that a force must be applied to the end 
of a spring to stretch it out, and the spring exerts a force pulling the stretched 
end back towards its original point. ‘The force that a person must exert for some 
purpose, like moving a cork or iron ball, is the one that connects most directly 
with our everyday experiences, though it is naturally the one that we might be 
hardest pressed to quantify accurately. 

Implicit in this discussion, by the way, is the important idea that forces have 
not only a magnitude, but also a direction; the force of gravity is exerted toward
the center of the earth, and the force exerted by a spring is directed along
the axis of the spring. Thus it is natural to represent forces mathematically by
vectors, i.e., by elements of $\mathbb{R}^3$, although we often picture them as arrows.


Newton makes one other important definition, which seems much less prob- 
lematical: “Quantity of motion”, or what we now call the momentum of an 
object, is simply the product m - v of its mass m and its velocity vector v. But 
even this definition hides difficulties, because it assumes that the motion of an 
object can be described in terms of a curve $c\colon \mathbb{R} \to \mathbb{R}^3$, with velocity vector
$\mathbf{v} = c'$, which is certainly not the case for a ball rolling down an inclined
plane, or a rod revolving as it is thrown. In essence, we are constricting our
initial considerations to “particles”, or “point masses”, abstractions that don’t
actually exist, but that represent reasonable first approximations in appropriate
cases—for example, in discussing the motion of the planets around the sun.
Mathematically, a particle is just a path $c\colon \mathbb{R} \to \mathbb{R}^3$, with derivative $c' = \mathbf{v}$,
together with a number $m > 0$ in $\mathbb{R}$.

Newton doesn’t offer any operational definition of mass, or of force, leaving 
us with these rather vague intuitive concepts. He ends the “Definitions” section 
with a Scholium (commentary) rather longer than the definitions themselves, 
but it sheds no light on this matter, instead treating other topics that we will be 
in a better position to consider after we examine the material that appears in 
the next section of the Principia. 




Newton’s Three Laws 


The first law. The second section of the Principia, “Axioms, or the Laws of 
Motion”, begins immediately with the statement of Law 1:


Every body perseveres in its state of being at rest or of moving uniformly
straight forward, except insofar as it is compelled to change its state by forces
impressed.


Nowadays we might state this as follows: 


An object not acted upon by any force has a constant velocity v, and, 
in particular, if it is initially at rest, then it remains at rest.


Newton, of course, did not call his law “Newton’s First Law”, and in the
lengthy Scholium to the second section he describes it as one of the principles 
“accepted by mathematicians and confirmed by experiments of many kinds”, 
explicitly mentioning Galileo, who was the first to enunciate it. Galileo’s con- 
tribution was not so much an experimental “verification” of this law as an ac- 
cumulation of experiments and reasonings to explain why everyday experiences 
seem to contradict it. Nowadays we can illustrate the law rather dramatically by 
sliding an object along a glass table with dry ice evaporating from it, forming a 
cushion of gas that practically eliminates friction. Or we can slide objects along
an “air trough”, a track with compressed air blown through holes in its sides, so
that our object is sliding along a thin layer of air (see Neher and Leighton [1]). 


[Figure: side view of a block sliding on an air trough, with compressed air entering from both sides]


In addition to the leap of imagination required by the first law, further clari- 
fication is required because the notion of position, and thus of velocity, depends 
on the coordinate system used by the observer. Newton’s first law essentially 
distinguishes certain spatial coordinate systems, like those set up in a stationary 
room, from others, like the coordinate system used by some one in an accelerat- 
ing train, or by an observer on a stone being swung in an arc by a sling shot. It 
basically states that there are certain coordinate systems in which $\mathbf{v}$ is constant
for any body not acted upon by forces. Such coordinate systems are often called
“inertial systems”, because they exhibit a body’s inertia, its tendency to stay at
rest or in uniform motion unless acted upon by some force. Thus, the first law
might be stated more completely as 


Newton’s First Law: There is at least one coordinate system—an
inertial system—in which any object not acted upon by any forces has 
constant velocity. 


Often, the first law is simply referred to as the law of inertia. 


Of course, all sorts of philosophical problems might arise if we enquired too 
closely into the distinction between definitions and observational facts that the 
first law entails—a danger with any axiomatically developed system—but two 
specific points should be made. 

First, it is clear that any coordinate system moving with a uniform velocity 
with respect to an inertial system is itself an inertial system. On the other 
hand, determining even one inertial system might be a little more difficult than 
expected. Although a coordinate system set up in a stationary room acts as 
an inertial system for various earthbound experiments, it clearly isn’t really an 
inertial system, since it is not only rotating in a 24 hour period around an 
axis, but also rotating in an annual period around the sun. A much better 
approximation to an inertial system is one based on the “fixed stars”, although 
we now know that the stars aren’t all that fixed either! 


In Newton’s time the “fixed stars” (i.e., the heavenly bodies that were not
planets or comets) were indeed thought to be at rest with respect to each other,
but in any case Newton would simply have regarded a reference frame based
on the fixed stars as an excellent experimental approximation to the “absolute
space” that he refers to in the Scholium at the end of the “Definitions” section,
this “absolute space” presumably being the inertial system that is “really at rest”,
rather than one moving with some uniform velocity with respect to it. 


Newton’s Scholium (Newton [2; pp. 408–415]) includes a long discussion of
the distinction between absolute and relative motion, ideas that have led, es- 
pecially with the advent of relativity theory, to considerable fundamental phys- 
ical/philosophical questions. Many elementary mechanics texts now include 
a discussion of these matters, motivated by the impulse to introduce ideas of 
modern physics as rapidly as possible. On the whole, we will allow them to 
fester until the proper time, in another volume, although Chapter 10 and some 
remarks in Chapter 7 touch on some of these topics. 


The second law. Newton also credits Galileo with the next law, which may be 
formulated in modern terminology as 


Newton’s Second Law: In an inertial system, the rate of change of 
momentum of a particle is directly proportional to the force F acting 
on it: 


$$\mathbf{F} = (m\mathbf{v})' = m\cdot\mathbf{v}'.$$


Here we have taken the constant of proportionality simply to be 1, since this 
just amounts to choosing a unit of force once units for mass and length and 
time have been determined. 


Note, by the way, that our final version of the first law emphasizes that it 
cannot be regarded simply as the special case F = 0 of the second law, since it 
is the first law that defines an inertial system. 


In the Principia, where concepts of calculus are eschewed as much as possible 
at the beginning, Newton speaks merely of the change of momentum, rather 
than its derivative, so the second law seems to be stated in terms of “impulsive 
forces” that act “instantaneously”—a hockey stick hitting a puck might be a 
good approximation to this ideal—though Newton certainly uses the second 
law in the more general sense whenever he needs it: 


A change in motion [momentum] is proportional to the motive force impressed
and takes place along the straight line in which that force is impressed.


At first sight, it’s hard to imagine how the second law could play a central 
role, or indeed any role, in the foundations of mechanics. On the left side of 
the equation 


$$\mathbf{F} = m\cdot\mathbf{v}'$$


we have the quantity F that we have discussed only in vague terms, without 
determining how to measure it, and on the right side of the equation we have 
the quantity m that has also been discussed only in vague terms, again without 
determining how to measure it. The equation $\mathbf{F} = m\cdot\mathbf{v}'$ might well be called
an “axiom”, in the purely mathematical sense of the word, but why would we
think of it as a “law of motion”, presumably something that we could check by
experiment!?

Nevertheless, the situation is not quite so hopelessly muddled as it seems. In 
fact, in his Scholium to the second section, Newton specifically cites Galileo in 
regard to the second law because of Galileo’s observations “that the descent of 
heavy bodies is in the squared ratio of the time”. Newton proceeds to explain 
that this is a consequence of the constant acceleration of gravity, working from 
the impulsive case to the continuous case: 


When a body falls, uniform gravity, by acting equally in individual 
equal particles of time, impresses equal forces upon that body and 
generates equal velocities; and in the total time it impresses a total 
force and generates a total velocity proportional to the time. And the
spaces described in proportional times are as the velocities and the 
times jointly, that is, in the squared ratio of the times. 


This is the sort of explanation that makes you glad that you aren’t trying to
learn mathematics in the 17th century! Here, perhaps, is what Newton seems to
be saying. Suppose we divide the time $t$ of descent into small intervals of length
$\Delta t = t/N$ for a large number $N$, and regard the motion with constant acceleration
$a$ as being uniform on each interval, with an “instantaneous” change
of speed of $a\cdot\Delta t$ at the beginning of each interval. Our body, starting at rest,
has velocity $a\cdot\Delta t$ during the first time interval of length $\Delta t$, falling through a
distance of $a(\Delta t)^2$; it has the velocity $2a\cdot\Delta t$ during the next time interval of
length $\Delta t$, falling through a distance of $2a(\Delta t)^2$, ... . Thus, at the end of time $t$
it has fallen a distance of $(1+2+3+\cdots+N)\cdot a(\Delta t)^2 = \tfrac{1}{2}aN(N+1)(\Delta t)^2$.
Since $N\Delta t = t$, this is close to $\tfrac{1}{2}at^2$.


Nowadays, of course, we just say 


If $s'' = a$ for a constant $a$, then $s' = at$, and thus $s = \tfrac{1}{2}at^2$.
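As a rough numerical check of the stepwise argument, the following Python sketch adds up the distances covered in $N$ intervals of uniform motion and compares the total with $\tfrac{1}{2}at^2$. The function names and the sample values of $a$ and $t$ are illustrative assumptions, not anything taken from the Principia.

```python
def stepwise_fall_distance(a: float, t: float, n: int) -> float:
    """Distance fallen when the speed jumps by a*dt at the start of each of n intervals."""
    dt = t / n
    # During the k-th interval (k = 1..n) the speed is k*a*dt, held for a time dt.
    return sum(k * a * dt * dt for k in range(1, n + 1))


def exact_fall_distance(a: float, t: float) -> float:
    """Distance fallen from rest under constant acceleration a, i.e. a*t**2/2."""
    return 0.5 * a * t * t


if __name__ == "__main__":
    a, t = 9.8, 2.0  # illustrative values only
    exact = exact_fall_distance(a, t)
    for n in (10, 100, 1000, 10000):
        approx = stepwise_fall_distance(a, t, n)
        print(n, approx, approx - exact)
```

Since the stepwise total is exactly $\tfrac{1}{2}at^2(1 + 1/N)$, the printed discrepancy should shrink in proportion to $1/N$.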


Even after all this explanation, we seem to be concerned with a purely math- 
ematical result about second derivatives. So it’s important to go back and look 
at one particular phrase in Newton’s argument: 


When a body falls, uniform gravity, by acting ...


How do we know that gravity is uniform?

Note that here we are not considering the question famously associated with 
Galileo concerning the effect of gravity on two different objects. We have only 
one object, and are asking how we know that the force of gravity on the object is 
constant throughout its descent, i.e., independent of its height. (Of course, that
isn’t actually true, since the force varies inversely as the square of its distance 
from the center of the earth, but the change is insignificant for any distances 
close to the radius of the earth.) 

We can exhibit the effect of the force of gravity by a measurement with a 
primitive scale consisting of a spring attached to an upper support which is kept 
fixed, say by nailing it to a wall, with our object attached 
to the bottom, pulling the spring down. Experience tells 
us that if we nail the upper support higher up on the 
wall, the spring still stretches to the same length; if we try 
the experiment on the third floor of the laboratory, the 
length is still the same; if we carry out the experiment at 
the top of the leaning tower of Pisa, the length is still the
same. Thus we see that the downward force of gravity
on a particular object doesn’t depend on the height of our object above the 
earth’s surface—we can use our primitive scale as a way of ascertaining that 
two forces are equal without having to worry about the question of just how we 
should assign a magnitude to forces. 

Note that this experiment doesn’t involve the acceleration our object would 
have if we allowed it to fall, it simply ascertains that the force of gravity is the
same throughout the downward path of the object when it does fall. Once we’ve
determined that the force of gravity is constant, we note that the assumption
$\mathbf{F} = mc''$ amounts to asserting that the downward acceleration is constant, and
the latter assertion is independently verified by the observed result that the
distance traveled by the body at time $t$ is proportional to $t^2$.

Of course, Newton and others didn’t perform preliminary experiments of the 
exact sort we have outlined, but innumerable weighings had been recorded at 
different heights without any mysterious discrepancies; in more prosaic terms, 
the force needed to raise an object seems to be the same whether it is resting 
on the top floor of a tower or on the ground floor. 


Now that we’ve seen that the second law does have some significance, we can 
try to concoct an operational definition of mass. We’ll begin with a definition 
that is conceptually very straightforward, although it would certainly be rather 
awkward to use in practice. 

First we want to have a very long air-trough, with a carriage, of negligible 
mass, in which we can place a body whose mass we want to measure. Parallel 
to this air-trough we have a track with a little cart that can be pulled along the 
track with any desired acceleration a; for simplicity, let’s imagine that we merely 
have to turn a knob on an instrument panel to vary a, without worrying about 
the clever mechanism that would be required to produce this effect. Of course, 
our track will have to be very long if we expect to pull the cart with constant 
acceleration a for any reasonable amount of time. 


[Figure: top view of the carriage on the air-trough and, on the parallel track, the cart pulled with acceleration a]


We attach an extending arm to our cart that can be placed behind the carriage 
on the air-trough so that the carriage is moved in tandem with the cart. But 
instead of placing this arm directly behind the carriage, we will put a nice strong 
spring between them. 


[Figure: the spring placed between the cart’s extending arm and the carriage]





Now let’s choose a particular body $B_0$ that we want to be our “unit mass”,
so that we will assign it mass $m = 1$. We place this body in the carriage and
pull our cart with some convenient constant acceleration $a_0$. Initially, of course,
the carriage will not move with the same acceleration, because the spring will
compress somewhat, so that the carriage won’t move exactly in tandem with
the cart. But we very quickly reach the point where the spring is no longer
compressing, or at any rate the length of the spring is constant within the limits
of accuracy of our measurements. We carefully measure this final length, and
call it $L_0$.

Note, by the way, that this whole set-up is dependent upon our original 
experimental observation that equal forces produce equal acceleration: the
compression of the spring measures the force that is being applied to the carriage,
and once the carriage is moving with a constant acceleration, the force
applied to it must be constant. 

Now let’s take some other body $B$ whose mass $m$ we want to determine.
We place $B$ in the carriage instead of $B_0$ and once again pull our cart with
constant acceleration $a_0$, and observe the final length of the compressed spring.
It probably isn’t $L_0$ any more, so we try adjusting the acceleration $a$ in order
to make it become $L_0$: if the spring was compressed more, to a length $< L_0$,
we try an acceleration $< a_0$, if it was compressed to a length $> L_0$ we try an
acceleration $> a_0$. After lots of trial and error, we finally find an acceleration $a$
which compresses the spring to exactly the length $L_0$. We now define


$$\text{mass } m \text{ of } B = a_0/a.$$


This definition makes the law $F = ma$ work for any particular fixed $F$, and
much experimentation would show that it works just as well for any other $F$; in
other words, if we repeated this whole process using a different spring, and thus
a different $L_0$, we would still end up assigning the same masses to all bodies.
Then, of course, we can use the equation in reverse, as a way of measuring force,
by seeing what acceleration is produced on a body of some known mass $m$.
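The trial-and-error procedure can be imitated numerically. The sketch below is only an illustration under stated assumptions: the spring is modelled by Hooke's law (the text does not assume any particular spring law), the "knob" is searched by bisection rather than by hand, and all function names and numerical values are hypothetical.

```python
def steady_spring_length(mass, accel, stiffness, natural_length=1.0):
    """Spring length once the carriage keeps pace with the cart.

    Assumes a Hooke's-law spring (not stated in the text): the spring must push
    with force mass * accel, so it is compressed by mass * accel / stiffness.
    """
    return natural_length - mass * accel / stiffness


def find_acceleration(measure_length, target_length, lo=0.0, hi=100.0, tol=1e-12):
    """Bisect on the knob setting until the measured length equals target_length.

    measure_length(a) stands in for running the experiment at acceleration a;
    the length is assumed to decrease as a grows, with the answer in [lo, hi].
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measure_length(mid) > target_length:
            lo = mid  # spring longer than L0: compressed too little, so raise a
        else:
            hi = mid  # spring at or below L0: compressed too much, so lower a
    return 0.5 * (lo + hi)


def operational_mass(hidden_mass, a0=2.0, stiffness=50.0):
    """Mass assigned by the procedure in the text: m = a0 / a."""
    l0 = steady_spring_length(1.0, a0, stiffness)  # unit body B0 pulled at a0
    a = find_acceleration(
        lambda accel: steady_spring_length(hidden_mass, accel, stiffness), l0
    )
    return a0 / a


if __name__ == "__main__":
    # Two different springs (hence two different L0) should assign the same mass.
    for k in (50.0, 20.0):
        print(k, operational_mass(2.5, stiffness=k))
```

In this toy model the assigned mass does not depend on the spring stiffness, which mirrors the remark that repeating the process with a different spring should give the same masses; in reality that is an experimental claim, not something the code can establish.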


A few subtle points ought to be mentioned here. First of all, although our 
original little experiment with the spring is certainly consistent with F = ma, 
it would hardly seem to be very conclusive. After all, how do we know that 
the correct law isn’t something like F = ma + ka’ for some constant k, so that 
third derivatives, or even higher derivatives, are involved? I don’t know of any 
experiments to directly test this, but there is an enormous body of experience 
that attests to it: the force of gravity isn’t constant over large distances, so all 
the calculations that keep satellites in motion, guide space ships to the moon 
and land them, etc., present a great deal of evidence. Newton provided the 
most important evidence of all, by showing that this law, together with the 
inverse square law for the force of gravity accounts for the elliptical orbits of the 
planets. (You might think that the argument is somewhat circular, for Newton 
essentially postulated an inverse square law, rather than relying on numerous 
experiments to verify it. However, Newton did test his law in one very important 
case, the motion of the moon around the earth. As a matter of fact, Newton 
initially shelved his whole theory of gravity because this case didn’t agree with 
measurements; later on, more accurate measurements of the radius of the earth 
then confirmed the idea—see Problem 2-2.) 

A second point involves the “additivity” of mass: if an object is made up of 
two parts, we expect its original mass to be the sum of the masses of the two 
new pieces, an idea inherent in the very notion of mass as “quantity of matter”.
But this hardly seems clear with our nice precise operational definition! If we
have two objects of masses $m_1$ and $m_2$, our operational definition means that $a_1$
is the acceleration that the first body must be subjected to in order to compress
the spring to length $L_0$, while $a_2$ is the acceleration that the second body must
be subjected to in order to compress the spring the same amount. If we join the
two objects together by placing them together in the carriage on our air-trough,
then the new object should have mass $m_1 + m_2$, which means, according to our
definition, that to obtain the same compression for the two objects together,
they must be subject to an acceleration $a$ satisfying

$$\frac{1}{a} = \frac{1}{a_1} + \frac{1}{a_2}, \qquad\text{i.e.,}\qquad a = \frac{a_1 a_2}{a_1 + a_2}.$$

At first glance, this might seem to be a strange fact that one might never have 
anticipated, one that could only be verified by a large number of experiments 
with varying masses, and perhaps never even divined from the experimental 
data obtained! Problem 25 gives a more promising approach. 

Finally, there is one other point that needs to be emphasized. When we first
investigated the “uniformity” of gravity (page 14), we said that “the force of
gravity on the object is constant throughout its descent, i.e., independent of its
height.” But we should also have said that it is independent of its velocity; this
was implicit in our mathematical description of the problem by the equation
$s'' = a$.

It is worth pointing this out, not only because it was clearly implicit in New- 
ton’s statement of the second law, a subtlety obviously clear to Newton, but also 
because it is false, a subtlety that was not known to Newton, or to any one else 
until 1905, when Einstein discovered it by purely theoretical means. 

If we were to measure the acceleration produced by a force of magnitude $F$
on an object moving with varying speeds $v$, including speeds $v$ close to the
speed of light $c$, we would find that if $F = m_0 v'(0)$ when the object starts at
rest, then when the speed is $v$ we have


$$F = \frac{m_0 v'}{(1 - v^2/c^2)^{3/2}}.$$


To be sure, no one ever states the result this way. Instead of saying that the
same force produces a smaller acceleration on an object when it is moving faster,
physicists always say that a moving object has a larger mass. You might think
they would say that the mass of a body moving with speed $v$ is $(1 - v^2/c^2)^{-3/2}$
times its mass at rest, but they don’t say that either (see Problem 8).
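To get a feeling for the size of this effect, here is a small Python sketch that evaluates the factor $(1 - v^2/c^2)^{3/2}$ by which the acceleration produced by a fixed force shrinks at speed $v$, compared with the acceleration it produces at rest. It assumes the relation $F = m_0 v'/(1 - v^2/c^2)^{3/2}$ with the force acting along the direction of motion; the function name and the sample speeds are illustrative.

```python
def acceleration_fraction(speed_over_c: float) -> float:
    """Acceleration at speed v divided by the acceleration at rest, for the same force.

    Uses F = m0 * v' / (1 - v**2/c**2)**1.5, so v'(at speed v) / v'(at rest)
    equals (1 - v**2/c**2)**1.5. The argument is the ratio v/c.
    """
    if not 0.0 <= speed_over_c < 1.0:
        raise ValueError("speed must satisfy 0 <= v/c < 1")
    return (1.0 - speed_over_c ** 2) ** 1.5


if __name__ == "__main__":
    for beta in (0.001, 0.1, 0.6, 0.9, 0.99):
        print(f"v/c = {beta}: acceleration reduced to {acceleration_fraction(beta):.6f} of its rest value")
```

At everyday speeds the factor differs from 1 by far less than any measurement in this chapter could detect, which is why the subtlety can be set aside here.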


Leaving aside this relativistic subtlety, which won’t pop up again anywhere in 
this volume, we simply want to mention a less direct, but much more convenient,
operational definition of mass, based on the mathematician’s and physicist’s
common view that a straight line is just a circle of infinite radius. 

Instead of using an air-trough, we simply attach our body $B$ to the end of
a very stiff spring that is being rotated horizontally with some large constant
“angular frequency” $\alpha$, so that $B$ moves along the circle


$$c(t) = R(\cos\alpha t, \sin\alpha t)$$


for some radius $R$. This radius $R$ will be somewhat larger than the unstretched
length of the spring, because $B$ actually begins moving along a spiral, pulling
the spring out, though its path soon becomes indistinguishable from a circle.


For the acceleration we simply have


$$c''(t) = -R\alpha^2(\cos\alpha t, \sin\alpha t), \tag{A}$$


so that the acceleration always points directly inward, and has magnitude $R\alpha^2$.
This means that the force $\mathbf{F}$ that the spring exerts on $B$ also always points
directly inward and has constant magnitude.

Declaring $B$ to be our unit of mass amounts to saying that


$$|\mathbf{F}| = 1\cdot R\alpha^2. \tag{a}$$





To determine the mass $m$ of any other body, we attach it to the end of our
spring and vary the angular frequency with which we rotate it until we arrive
at an angular frequency $\beta$ for which our body is moving along a circle


$$c(t) = R(\cos\beta t, \sin\beta t)$$

of the same radius $R$. Now we should have

$$|\mathbf{F}| = m\cdot R\beta^2, \tag{b}$$


with the $|\mathbf{F}|$ in equations (a) and (b) having the same value, since in both cases
the spring has been stretched by the same amount. In other words, we can
determine $m$ by

$$m = \alpha^2/\beta^2.$$
We've ignored the effect of gravity on these bodies, but that would become
negligible in comparison to the force of our stiff spring when $\alpha$ is large, or we
might imagine the measurements being made in outer space.
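The rotating-spring definition can be checked for internal consistency with a short Python sketch. It assumes, purely for illustration, a Hooke's-law spring with a hypothetical stiffness and rest length, so that the steady radius can be computed; the angular frequency $\beta$ is obtained here from the "hidden" mass as a stand-in for the experimental search described above.

```python
import math


def orbit_radius(mass, omega, stiffness=4000.0, rest_length=1.0):
    """Radius of steady circular motion on a Hooke's-law spring (an assumed model).

    The spring force stiffness * (R - rest_length) must supply the inward force
    mass * R * omega**2, which gives R = stiffness * rest_length / (stiffness - mass * omega**2).
    """
    denom = stiffness - mass * omega ** 2
    if denom <= 0:
        raise ValueError("spring too soft for this mass and angular frequency")
    return stiffness * rest_length / denom


def mass_from_frequencies(alpha, beta):
    """Mass assigned by the definition in the text: m = alpha**2 / beta**2."""
    return alpha ** 2 / beta ** 2


if __name__ == "__main__":
    alpha = 20.0       # angular frequency used for the unit body
    hidden_mass = 4.0  # the mass we pretend not to know
    # Stand-in for the experimental search: the frequency at which the radii agree.
    beta = alpha / math.sqrt(hidden_mass)
    print(orbit_radius(1.0, alpha), orbit_radius(hidden_mass, beta))
    print(mass_from_frequencies(alpha, beta))
```

The two printed radii agree, and the ratio $\alpha^2/\beta^2$ recovers the hidden mass, as equations (a) and (b) require.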

I’m sure that the basic mechanism for this definition could be greatly refined. 
Instead of a spring, one might whirl a tube filled with mercury, and measure the 
compression of this mercury column, etc. But I don’t think any one has ever 
actually produced a mechanism of this sort. In fact, as far as I know, no one 
has ever measured the mass of anything accurately. This statement obviously
requires a bit of explanation! 


Mass and weight are different... The downward force of gravity that Newton
refers to in reference to the second law and Galileo’s observations of falling
bodies is, of course, what we usually call the weight of the body.


Neophyte physicists are always warned not to confuse mass, a measure of an 
object’s resistance to being moved, and weight, a measure of the gravitational 
force exerted on it by the earth. Nowadays we might emphasize that distinction 
by pointing out that the mass of a body on the moon would still be the same 
even though its “weight”—the gravitational force exerted on it by the moon— 
would be much less. Even more dramatically, we could consider a space ship 
cruising at a constant speed through space, far from any planets; in this space 
ship, a large iron ball and a large cork ball would both float effortlessly without 
any support from the floor(s), but moving the iron ball would be much harder
than moving the cork ball! If the two balls were painted the same color, one 
can imagine all sorts of unpleasant practical jokes that might be perpetrated on 
a naive space cadet. 


... yet not so different. On the other hand, even after comprehending the 
distinction, we might very well suspect that the relative masses of two bodies are 
the same as their relative weights. 


On a crude scale, this idea certainly conforms to our experience. For example, 
knowing that the weight of our iron ball—the upward force that we would need 
to exert to keep it from falling—is much greater than the weight of our cork 
ball, we are hardly surprised that the force required to set the iron ball rolling 
at a certain speed is much greater than the force required for the cork ball, and 
we shouldn’t be surprised that the same holds true in the cruising space ship. 

In terms of the second law, however, we can make a much more specific 
correlation: Since the weight of an object, of mass m, say, is the force F of gravity 
on it, the law $\mathbf{F} = m\cdot\mathbf{v}'$ means that the ratio $F/m$ of weight to mass is simply
the acceleration that an object undergoes under free fall. Thus, proportionality
of weight to mass is equivalent to the assertion that all bodies fall with the same 
acceleration, the famous fact usually attributed to Galileo. 


Incidentally, even before Galileo’s experiments, J.-B. Benedetti (1530-1590) 
had argued that different size bodies of the same material must fall with the
same velocity, since a large body could simply be considered as two smaller 
bodies side by side (compare the arguments on page 27). But Benedetti still 
thought that denser bodies would fall more rapidly than less dense ones, and 
Galileo is usually credited with being the first, or at least one of the first, to 
assert the equal acceleration of bodies with differing compositions, like wood 
and iron—or, more to the taste of a modern physicist, like aluminum and gold, 
which have such different proportions of protons and neutrons. 


Although Galileo’s experiments—real or mythical—may have been the first 
investigation of this fact, it can be verified with much greater accuracy by re- 
sorting once again to the trick of replacing linear motion by circular motion, in 
this case by considering a pendulum. Problem 17 talks about the pendulum in
a little detail, but basically the pendulum bob moves in a circle about the pivot
point because the downward gravitational force on the pendulum bob is par- 
tially offset by a force directed along the string that is just large enough to result 
in a motion tangent to the circle. Instead of trying to compare the downward 
path of two falling objects, we can instead use them as the bobs for two pen- 
dulums of the same length, and see if they move at the same rate—something 
that one can verify with great accuracy by letting the pendulums swing through 
many cycles. Addendum 1B discusses experiments of this sort in much greater 
detail. 

Once we've found that weight is strictly proportional to mass, it becomes
unnecessary to measure mass, i.e., to find the ratio of masses of different objects,
because this ratio is just the same as the ratio of their weights. Even though
weight may vary with location, the ratios of weights at any given location give
the ratios of the corresponding masses.

Naturally, the proportionality of weight to mass only adds to the usual con- 
fusion between these two concepts. In fact, before Newton, the distinction had 
hardly ever been made, and the weight of an object was generally regarded as 
an invariable property of the object. It’s not surprising that hardly any thought 
was given to the idea that the weight of an object might vary with its distance 
from the earth, since any such variation would have been extremely difficult to 
observe directly. 

Indeed, it was indirect measurements that accidentally provided the first evidence
that the weight of a body changes at different distances from the earth.
In 1672, the astronomer Jean Richer, making observations at Cayenne (latitude 
5° north), found that his pendulum clock, which kept perfect time in Paris, was 
going, compared to the mean motion of the sun, more slowly by 2 minutes 
and 28 seconds per day, though he couldn’t explain the difference; and similar 
observations were made later by others, including Halley. Newton interpreted 
these observations as an indication that the earth bulged near the equator, because
of its rotation, so that locations near the equator were further from the
center of the earth. In the third book of the Principia, Newton used this data to
estimate that the equatorial radius of the earth was about 17 miles greater than 
its polar radius.¹ Modern estimates give about 21.4 km, or about 13 miles, a
discrepancy mainly due to the higher density of the earth’s core. 
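To see how small a change in weight Richer's clock was registering, one can turn the daily loss of 2 minutes 28 seconds into a change in the effective acceleration of gravity. The Python sketch below assumes the small-amplitude pendulum formula $T = 2\pi\sqrt{\ell/g}$ and an unchanged pendulum length (so it ignores, for instance, thermal expansion); these assumptions and the function name are additions for illustration, not part of the historical account.

```python
SECONDS_PER_DAY = 86400.0


def gravity_ratio_from_clock_loss(seconds_lost_per_day: float) -> float:
    """Ratio g_new / g_old implied by a pendulum clock losing the given time per day.

    The clock counts swings, so in one true day it records only
    (86400 - loss) seconds' worth of swings: T_new / T_old = 86400 / (86400 - loss).
    With T = 2*pi*sqrt(length/g) and the length unchanged, g scales as 1 / T**2.
    """
    period_ratio = SECONDS_PER_DAY / (SECONDS_PER_DAY - seconds_lost_per_day)
    return 1.0 / period_ratio ** 2


if __name__ == "__main__":
    loss = 2 * 60 + 28  # Richer's figure: 2 minutes 28 seconds per day
    ratio = gravity_ratio_from_clock_loss(loss)
    print(f"g at Cayenne / g at Paris is roughly {ratio:.6f}")
    print(f"fractional decrease is roughly {(1 - ratio) * 100:.3f} percent")
```

Under these assumptions the decrease comes out to about a third of one percent, which suggests why no direct weighing of the period could have revealed it.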

Separating the idea of mass and weight, deciding on mass and force as the 
basic concepts in terms of which others should be defined, and choosing the 
first two laws as the basis for deducing other results, was one of Newton’s main 
achievements. The success of his choices is one of the reasons that we still speak
of classical physics as Newtonian mechanics. 


The third law. Strenuous exertions have allowed us to tease out a bit of meaning
from the first two laws, both of which involve individual bodies, but say
nothing about the interactions between different bodies. This information is
given by Newton’s third law, and since all of mechanics supposedly rests on
Newton’s three laws, this one must really be a doozy. In fact, it is usually stated
as a memorable apothegm: “Every action has an equal and opposite reaction”.
In this form it is ideally suited to misappropriation by armchair philosophers,
moral and political thinkers, and others of that ilk.

But Newton’s statement was much more specific: 


Newton’s Third Law:

To any action there is always an opposite and equal reaction; in other words,
the actions of two bodies upon each other are always equal and always opposite
in direction.


Thus, if one object exerts a force F on a second object, then the second object 
exerts the force —F on the first. Numerous misuses and invalid analogues of 
the third law ignore this basic fact that the two actions in question are exerted 
on two different bodies. 

A simple example of the third law is provided by the gravitational force F 
that the earth exerts on an object: according to the third law, the object itself 
exerts a force —F on the earth; of course, the resulting change in momentum 
of the massive earth is basically unnoticeable, because it involves such a small 
change in the earth’s velocity. 

Newton gives various everyday examples of the third law: “If anyone presses 
a stone with a finger, the finger is also pressed by the stone. If a horse draws 
a stone tied to a rope, the horse wil (so to speak) also be drawn back equally 
toward the stone ... ” Like all “simple” physics examples, these are actually 


¹ The argument (using some material from Chapter 4), and some historical background, 
can be found in Chandrasekhar [2; pp. 381 ff.], or quickly set out at the beginning of 
the comprehensive book Chandrasekhar [1], examining the many later extensions of 
the method; a simple outline may also be found in Hand and Finch [1; pp. 339 ff.]. 




complicated phenomena that nowadays we might consider to be compounded 
from myriad instances of the third law applied to the atoms of which these 
everyday objects are composed. But for the study of mechanics, objects like 
billiard balls often serve as basic constituents. 


As a particular instance of the third law, we consider the collision of two such 
objects: $B_1$, having mass $m_1$, and $B_2$, having mass $m_2$. During the collision, 
$B_2$ will be exerting a force $\mathbf{F}_{12}$ on $B_1$, while $B_1$ will be exerting a force $\mathbf{F}_{21}$ on $B_2$ 
(the first subscript indicates the body on which the force acts). We should really 
write $\mathbf{F}_{12}(t)$ and $\mathbf{F}_{21}(t)$ because these forces may vary with time; in fact, they 
presumably vary in an incredibly complicated way, depending on the particular 
way that the two bodies are compressed, spin, vibrate, undulate, bobble, etc. 
But we always have $\mathbf{F}_{12}(t) = -\mathbf{F}_{21}(t)$, so for all times $t$ we have 

$$
\begin{aligned}
(m_1\mathbf{v}_1 + m_2\mathbf{v}_2)'(t) &= (m_1\mathbf{v}_1)'(t) + (m_2\mathbf{v}_2)'(t) \\
&= \mathbf{F}_{12}(t) + \mathbf{F}_{21}(t) \\
&= 0.
\end{aligned}
$$

Thus, no matter how complicated the collision may be, the “total momentum” 
$m_1\mathbf{v}_1 + m_2\mathbf{v}_2$ is constant. Or, as we like to say, momentum is conserved. 
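The point that the details of the force do not matter can be checked numerically. The following is a minimal sketch, restricted to motion along a line: the function `messy_force`, the masses, and the step size are all invented for illustration, and the only physics used is that the force on the second body is the negative of the force on the first.

```python
import math


def messy_force(t):
    """An arbitrary, deliberately complicated F12(t); purely illustrative."""
    return (-50.0 * math.exp(-3.0 * t) * math.sin(40.0 * t) ** 2
            + 7.0 * math.cos(90.0 * t))


def collide(m1, m2, v1, v2, steps=10_000, dt=1e-4):
    """Euler steps for two bodies on a line, using only F21(t) = -F12(t).

    Returns the final velocities and the total momentum after every step.
    """
    momenta = []
    for k in range(steps):
        f12 = messy_force(k * dt)   # force on body 1 (exerted by body 2)
        f21 = -f12                  # third law: force on body 2
        v1 += f12 / m1 * dt
        v2 += f21 / m2 * dt
        momenta.append(m1 * v1 + m2 * v2)
    return v1, v2, momenta


if __name__ == "__main__":
    w1, w2, momenta = collide(m1=2.0, m2=3.0, v1=1.0, v2=-0.5)
    print(w1, w2, min(momenta), max(momenta))
```

The individual velocities change considerably, while the recorded total momentum stays at its initial value up to rounding error.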

Newton explicitly stated this “conservation of momentum” law as a Corollary 
of the laws of motion: 


The quantity of motion, which is determined by adding the motions made in 
one direction and subtracting the motions made in the opposite direction, is 
not changed by the action of bodies on one another. 


In fact, it was the experimental verification of conservation of momentum that 
Newton cited as the evidence for the third law. In his Scholium he states that 
“Sir Christopher Wren, Dr. John Wallis, and Mr. Christiaan Huygens, easily the 
foremost geometers of the previous generation independently found the rules of 
[collisions of bodies], and communicated them to the Royal Society at nearly the 
same time,” specifically mentioning that “Wren additionally proved the truth of 
these rules ... by means of an experiment with pendulums ... .” Pendulums, 
with their advantage of replacing linear motion with circular motion, were also 
the device of choice in the 18th century for practically eliminating friction, with 
the added advantage that one could estimate the speed of the pendulum bob at 
the bottom of its path by noting how high the pendulum rose on the next swing. 
Newton himself conducted experiments with colliding pendulums carried out 
with special care to take into account the effects of air resistance, and gives a 
full account of them in the Principia (Newton [2; pp. 425-427]). 




Thus, Newton took a basic experimental fact, the conservation of momentum 
in collisions, and recast it as a corollary of an apparently equivalent formulation, 
Newton’s Third Law. Nowadays, physics texts may allude to the essential role 
played by the third law—“forces always appear in pairs” —but they give scant 
attention to the fact that Newton’s decision to cast those experimental results 
in terms of the third law was an incredibly audacious generalization! Based 
on results involving the completely unknown repulsive forces between colliding 
bodies, Newton hypothesized a more general law concerning all forces. 


And, despite the snide remark offered previously concerning non-scientists’ 
misunderstanding of the third law, it may be said with equal justification that 
the significance of the third law is almost completely obscured in elementary 
physics classes, where it is usually regarded as no more than a tool for solving 
problems—“draw a force diagram, remember that if A exerts a force on B, 
then B exerts an equal and opposite force on A, blah, blah, blah, ... ” 


In this regard, it is instructive to consider a little experiment measuring the 
attraction of a magnet and a piece of iron. In the 18th century manner, we 
might employ the magnet and iron as the bobs of two pendulums, but let’s take 
advantage of modern technology and simply place them on a surface with dry 
ice, or on an air-trough. We start by holding the magnet and iron fixed (so that 
the total momentum is 0) and then release them at the same moment. As soon 
as we do this, the iron starts moving towards the magnet (the magnet attracts the 
iron), and likewise the magnet starts moving in the opposite direction, toward 
the iron (the iron attracts the magnet). With modern technology it is easy to 
make accurate measurements of their positions over very small intervals of time, 
and thereby determine their velocities at these times, giving us an enormous 
amount of data to show quite convincingly that we do indeed have conservation 
of momentum. 


Even without this modern technology, we can still make a convincing case, 
on the basis of one crucial observation: when the magnet and iron collide, they 
come to a dead stop; no matter the relative size of the magnet and iron, the total 
momentum at the moment of collision, and thereafter, is 0. So it must have 
been very close to 0 just before the collision. But we can obtain a whole range 
of velocities just before collision simply by varying the initial distance between 
the magnet and iron—or, equivalently, by varying the strength of the magnet. 
Thus the total momentum must always be zero! 


Our experiment could naturally be carried out using two magnets, but we 
specifically chose a magnet and a piece of iron to emphasize the non-intuitive 
nature of the third law: after all, just because the magnet pulls the iron, why 
should the iron be pulling the magnet (let alone with exactly the reverse of the 




force with which the magnet is pulling the iron)? In everyday life, we seldom give 
much attention to this second force. If we take one of those cute little magnets 
that are used to keep pieces of paper on refrigerator doors and hold it close 
to the refrigerator, we usually don’t notice the refrigerator moving toward the 
magnet! What we do notice is the magnet being pulled toward the refrigerator 
door; nevertheless, we tend to think of this as a property of magnets, not as a 
property of refrigerator doors, hardly ever thinking of the refrigerator door as 
exerting a mysterious force. 


As it happens, our modern conceptions of magnetism explain rather nicely 
the mutual attractions of a magnet and a piece of iron: the individual atoms 
of the iron each act as magnets, except that they are oriented randomly, and 
the magnet causes them to align, so that the iron now acts as a magnet also. 
Ultimately, it’s all a matter of iron atoms attracting each other. 

Similarly, when an object with a static electric charge is used to give a similar 
charge to a second object, we would expect to find (repulsive) forces of equal 
magnitude, since ultimately it’s all due to the mutual repulsive forces between 
electrons. Of course, we also find (attractive) forces of equal magnitude between 
objects with opposite charges, an example of that mysterious and pervasive 
duality one finds throughout nature. 

This seems to leave the least esoteric example—the equal (repulsive) forces 
between colliding bodies, no matter how different their composition—as the 
most difficult to understand! Eventually, this must come down to the equal- 
ity of certain forces between atomic particles. Indeed, the non-intuitive nature 
of the third law is highlighted when we consider the most important of these 
atomic particles, the protons and neutrons. We naturally expect that the third 
law should apply to the reactions of two protons and to the reactions of two 
neutrons, since these are both completely symmetric situations, involving iden- 
tical particles. But that doesn’t explain at all why the forces between a proton 
and a neutron should exhibit this same sort of symmetry. In other words, in 
order to really understand the third law, it must be necessary to understand all 
sorts of nuclear physics (thus I was quite delighted to learn that a proton is made 
up of two “up” quarks and one “down” quark, while a neutron is made up of 
two “down” quarks and one “up” quark, even though I haven’t the slightest 
idea what quarks are). 

Finally, let us return to our first example of the third law, the gravitational 
force that the earth exerts on an object. In this case the reciprocal force of the 
object on the earth is essentially negligible, and the same observation prevails 
when we are considering the earth, or some other planet, in relation to the sun: 
for all practical purposes we might just as well consider only the gravitational 
force that the sun produces on the planet, and ignore any force that the planet 




produces on the sun (Jupiter is something of an exception). The situation is 
quite different for the case of the earth and the moon, however, and this case 
was of enormous interest for Newton, who made it the subject of the Principia’s 
final Book 3, The System of the World. 

The force of gravity, unlike the repulsive forces between colliding bodies, is 
an attractive force, and it is also a much weaker force; the gravitational attrac- 
tion between two laboratory-sized objects is so minute that it was impossible for 
Newton to provide direct experimental evidence for the third law in the case of 
gravitational forces (see also Chapter 7 in this regard). Instead, this evidence is 
ultimately provided by Newton’s analysis of the earth-moon system, and the re- 
sulting tides, and the accuracy with which it agrees with observation. As for the 
question of why the third law should hold for gravitational forces, the ultimate 
answer to that question presumably lies in the theory of General Relativity. 


Though it might seem that we have provided far too extensive a discussion 
of the third law, which merits no comparable attention in physics books, there 
is an equal and opposite reaction, that physics books provide far too little dis- 
cussion. One might well wonder why critical readers readily accept so general 
a law buttressed by so little experimental evidence, as if it somehow expresses a 
morally compelling symmetry. Perhaps it’s just because it’s so easy to confuse 
laws of nature expressing the symmetry of space with other laws that merely 
seem to. This will be discussed in greater detail in relation to Lagrangian me- 
chanics (at the end of Chapter 13), but the role of symmetry in mechanics bears 
examination even in our current setting. 


The lures of symmetry. Our statement of the second law, as $\mathbf{F} = m\mathbf{v}'$, contains 
within it an assertion specifically enunciated in Newton’s version: A change in 
momentum is proportional to the motive force impressed and takes place along the 
straight line in which that force is impressed. The second phrase would probably be 
taken for granted even if Newton had not mentioned it. After all, what other 
direction could it have, other than the reverse direction? If we assume that 
some specific direction must be determined, and also assume that the laws of 
physics must be invariant under rotations, it would appear that this is the only 
possibility. 

Similarly, in the third law it is usually taken for granted that the equal and 
opposite forces of reaction between two point masses are directed along the line 
between them. This more specific statement, often referred to as the “strong 
form” of the third law, is needed for later theorems in mechanics found in 
Chapter 3, and plays a crucial role in the analysis of rigid bodies in Chapter 5. 
It should be noted, however, that Newton didn’t assert this stronger version of 
the third law—note that it wasn’t used in proving conservation of momentum 




for the collision of two bodies—and that Newton never took up the topic of rigid 
bodies in the Principia. Moreover, to be candid, it must be remarked that this 
conclusion isn’t even true without quite a bit of amplification and modification, 
as briefly indicated in Addendum 5A. 

This leaves the other part of the third law, that the forces of reaction between 
two point masses are always of the same magnitude, though in opposite direc- 
tions. Like Euclid’s notorious Fifth Postulate, this assertion has a long history of 
arguments designed to demonstrate it. In fact, a wonderfully ingenious demon- 
stration was advanced by Huygens, in his arguments concerning the rules of 
collisions that were referred to in Newton’s Scholium (page 22). Here is a slight 
modification and simplification of his arguments. 

First consider two identical bodies, say two steel balls, moving toward each 
other with equal speeds, i.e., with velocities $\mathbf{v}$ and $-\mathbf{v}$. In this simple situation it 
is obviously reasonable to assume, on the basis of symmetry, that their rebound 
velocities will also be negatives of each other, $\mathbf{w}$ and $-\mathbf{w}$, so that conservation 
of momentum holds: it is 0 both before and after the collision. 

Now let us imagine the same experiment as observed in a coordinate system 
that is moving with uniform velocity u with respect to us, like a boat moving 
with respect to the shore, to take Huygens’ example. In this coordinate system, 
the objects are moving with the initial velocities 


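$$\mathbf{v}_1 = \mathbf{v} + \mathbf{u} \quad\text{and}\quad \mathbf{v}_2 = -\mathbf{v} + \mathbf{u},$$

while their rebound velocities are 

$$\mathbf{w}_1 = \mathbf{w} + \mathbf{u} \quad\text{and}\quad \mathbf{w}_2 = -\mathbf{w} + \mathbf{u},$$

so $\mathbf{v}_1 + \mathbf{v}_2 = \mathbf{w}_1 + \mathbf{w}_2$ ($= 2\mathbf{u}$). Since we can obtain any pair $\mathbf{v}_1$, $\mathbf{v}_2$ by choosing 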
the appropriate u and v, we find that in the coordinate system of the boat, 
moving uniformly with respect to the shore, conservation of momentum holds 
for two identical bodies approaching each other with arbitrary velocities. Of 
course, we could just as well interchange the role of the boat and the observer 
on shore, to reach the same conclusion for our observer on shore. 
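Huygens' argument can be run as a small computation, restricted to motion along a line. The sketch below is only an illustration under stated assumptions: the function name and the `restitution` parameter are hypothetical (the symmetry argument says nothing about the size of the rebound velocity, so any factor can be tried), and the frame velocity is chosen so that the two bodies appear with opposite velocities.

```python
def huygens_rebound(v1, v2, restitution=1.0):
    """Rebound velocities of two identical bodies with arbitrary v1, v2.

    Step 1: pick the frame (the "boat") moving with u = (v1 + v2) / 2,
            in which the bodies have velocities +v and -v.
    Step 2: by symmetry they rebound with velocities w and -w there.
            We take w = -restitution * v; this factor is an assumption
            made for illustration, since symmetry does not determine |w|.
    Step 3: transform back to the shore by adding u.
    """
    u = (v1 + v2) / 2
    v = (v1 - v2) / 2
    w = -restitution * v
    return w + u, -w + u


if __name__ == "__main__":
    for e in (1.0, 0.5, 0.0):
        w1, w2 = huygens_rebound(3.0, -1.0, e)
        print(e, w1, w2, w1 + w2)
```

Whatever value is chosen for `restitution`, the sum of the velocities after the collision equals the sum before it, which is all that the argument claims.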

Rather than following the succeeding course of Huygens’ arguments (see 
Problem 3-9 and remarks in Chapter 7), we will add some considerations from 
Volume 1 of 

Feynman [1], The Feynman Lectures on Physics, Addison Wesley, 1963, 

an essential book in the library of anyone interested in physics. 
Let’s use steel cubes for convenience, and suppose that glue has been applied 
to opposing faces so that they will stick together when they meet. Symmetry 




dictates that when they approach each other with the same velocity and then 
stick together, they will end up at rest (compare page 23), so that conservation of 
momentum holds. Huygens’ argument then implies that a collision with initial 
velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ results in a “double cube” moving with velocity $\tfrac{1}{2}(\mathbf{v}_1 + \mathbf{v}_2)$. 

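We can apply these results to the case of one cube with velocity $\mathbf{v}_1$ colliding 
with two other cubes that have glue applied to opposing faces, but are moving 
in tandem, separated by a tiny distance, with velocity $\mathbf{v}_2$. 

[Figure: cube 1, moving with velocity $\mathbf{v}_1$, approaches cubes 2 and 3, which move in tandem with velocity $\mathbf{v}_2$.] 

Immediately after cube 1 and cube 2 collide, they have velocities $\mathbf{w}_1$ and $\mathbf{w}_2$ satisfying 

$$\text{(a)}\qquad \mathbf{w}_1 + \mathbf{w}_2 = \mathbf{v}_1 + \mathbf{v}_2.$$

A moment later, cube 2 collides with and sticks to cube 3, after which the 
resulting double cube moves with velocity $\mathbf{w}_{12}$ satisfying 

$$\text{(b)}\qquad 2\mathbf{w}_{12} = \mathbf{w}_2 + \mathbf{v}_2.$$

It follows that 

$$
\begin{aligned}
\text{initial total momentum} &= m\mathbf{v}_1 + 2m\mathbf{v}_2 \\
&= m(\mathbf{v}_1 + \mathbf{v}_2) + m\mathbf{v}_2 \\
&= m(\mathbf{w}_1 + \mathbf{w}_2) + m\mathbf{v}_2 && \text{by (a)} \\
&= m\mathbf{w}_1 + m(\mathbf{w}_2 + \mathbf{v}_2) \\
&= m\mathbf{w}_1 + 2m\mathbf{w}_{12} && \text{by (b)} \\
&= \text{final total momentum}.
\end{aligned}
$$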


Imagining the tiny distance decreased to 0, we conclude that conservation of 
momentum holds for a collision of a cube with a double cube, and we can easily 
generalize the argument for any multiple cube colliding with any other. 
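The chain of equalities above can be checked with exact arithmetic. In this minimal sketch (for motion along a line) the function name is hypothetical, and the velocity `w1` of cube 1 after the first collision is treated as a free input, since the argument uses only relation (a) and not the details of the rebound.

```python
from fractions import Fraction


def cube_hits_double_cube(m, v1, v2, w1):
    """Follow the two-stage collision: cube 1 hits cube 2, then cube 2
    sticks to cube 3. Returns (initial, final) total momentum.

    w1 is cube 1's velocity after the first collision; it is a free input
    here, because only relation (a) is used, not the rebound details.
    """
    w2 = v1 + v2 - w1            # (a): w1 + w2 = v1 + v2
    w12 = (w2 + v2) / 2          # (b): 2 * w12 = w2 + v2
    initial = m * v1 + 2 * m * v2
    final = m * w1 + 2 * m * w12
    return initial, final


if __name__ == "__main__":
    print(cube_hits_double_cube(Fraction(3), Fraction(5), Fraction(-2), Fraction(1, 3)))
```

Using `Fraction` avoids rounding, so the two totals can be compared for exact equality for any choice of `w1`.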

This clever argument might require supplementary considerations to deal 
with steel cubes stacked differently, let alone with objects of arbitrary shape, 
but the real problem is that it applies only to two objects made of the same 
“homogeneous” material. As soon as we consider objects made of different 
materials we are at an impasse. Given a steel cube and an aluminum cube of 
the same mass, approaching each other with velocities $\mathbf{v}$ and $-\mathbf{v}$, no symmetry 
argument allows us to conclude that the rebound velocities are also negatives 
of each other; having the same mass simply means that they are given the 
same acceleration by a given force, it says nothing about why they should react 
symmetrically in this situation. And that is the whole mystery of the third law. 

[Figure: a collision involving steel cubes stacked differently, with velocities $\mathbf{v}_1$ and $\mathbf{v}_2$.] 

As Feynman also points out, one cannot somehow get around this problem 
simply by defining two bodies to have the same mass if they rebound with equal 
speeds in such an experiment, because then there is no logical reason why 
“having the same mass” should be a transitive relation. 


Composition of forces. Newton’s statement of the conservation of momentum 
is actually presented as Corollary 3 of the three laws of motion. Before proving 
this Corollary, Newton has proceeded in the classical manner of Euclid, with a 
dubious proposition, providing, as Corollary 1 to his axioms, the rule for 
determining the effect of two forces $\mathbf{F}_1$ and $\mathbf{F}_2$ acting on an object simultaneously. 
Nowadays the vector space structure of $\mathbb{R}^n$ is so ubiquitous and appears 
so natural that we might unhesitatingly aver that the effect must simply be 
the standard vector sum $\mathbf{F}_1 + \mathbf{F}_2$, constructed geometrically by the familiar 
parallelogram construction. But that can’t simply follow from the fact that we’ve 





decided to represent forces by vectors! In fact, the whole reason for introducing 
vector addition in the first place was because it represented the “addition” of 
forces. 

Modern mathematicians have no trouble accepting the fact that Euclid’s 
“proof” of the side-angle-side theorem is irreparably defective, and that the 
statement must be taken as a basic axiom. Even in physics, where the logical 
underpinnings of concepts are never so removed from their actual application 
in the real world, it should be clear that the parallelogram rule for combining 
forces can’t possibly be proved on the basis of the three basic laws, since none of 




them say a thing about two forces acting at once.¹ That hasn’t prevented many 
eminent physicists and mathematicians from attempting to fashion a proof, but 
consideration of those attempts, requiring such involved and ultimately futile 
maneuvers, has been relegated to Chapter 7. 

Physicists nowadays seem resigned to the stance of regarding the parallelogram 
law as just another law based on observation, and mechanisms like the 
one pictured below may be used to illustrate it in classroom settings. Strangely 
enough, however, if the parallelogram law really is an experimental fact, then 
one would expect physicists to be testing it with great precision, but no one 
ever mentions such experiments! 

[Figure: a classroom mechanism illustrating the parallelogram law; at equilibrium, $\mathbf{F}_3 = -(\mathbf{F}_1 + \mathbf{F}_2)$.] 

I imagine physicists would say that the parallelogram law of forces is demonstrated 
by the consistency with which its many uses intermesh, but I suspect 
the real reason no one bothers to test this law is because everyone thinks that it 
really has to be true, as Chapter 7 so eloquently testifies. 


With a law for the composition of forces at hand, we can now consider 
the total momentum of a general system of particles $c_1, \dots, c_K$, with masses 
$m_1, \dots, m_K$. Assuming there are no “external” forces (like the force of gravity), 
but only the “internal” forces $\mathbf{F}_{ij}$ that the particle $c_j$ exerts on $c_i$, with 
$\mathbf{F}_{ij} = -\mathbf{F}_{ji}$, in accord with the third law, the total force on particle $c_i$ is $\sum_j \mathbf{F}_{ij}$, 
and consequently 

$$\Big(\sum_i m_i\mathbf{v}_i\Big)' = \sum_i\Big(\sum_j \mathbf{F}_{ij}\Big) = 0,$$

which means that $\sum_i m_i\mathbf{v}_i$, the total momentum of the system, must be a 
constant. 
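The cancellation in the double sum is purely a matter of bookkeeping, as the following minimal sketch suggests. The function name, the use of two-dimensional forces, and the random values are all assumptions made for illustration; the only constraint imposed is the antisymmetry required by the third law.

```python
import random


def total_internal_force(num_particles, seed=0):
    """Build random pairwise forces with F[i][j] = -F[j][i] and sum them all.

    F[i][j] is the (2D) force on particle i due to particle j, matching the
    convention that the first index names the body acted upon.
    """
    rng = random.Random(seed)
    F = [[(0.0, 0.0)] * num_particles for _ in range(num_particles)]
    for i in range(num_particles):
        for j in range(i + 1, num_particles):
            fx, fy = rng.uniform(-1, 1), rng.uniform(-1, 1)
            F[i][j] = (fx, fy)
            F[j][i] = (-fx, -fy)      # third law
    indices = range(num_particles)
    total_x = sum(F[i][j][0] for i in indices for j in indices)
    total_y = sum(F[i][j][1] for i in indices for j in indices)
    return total_x, total_y


if __name__ == "__main__":
    print(total_internal_force(6))
```

Each pair contributes a force and its negative to the double sum, so the total vanishes (up to rounding) however the individual forces are chosen.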


¹ From a strictly logical point of view, it is not even clear that two forces $\mathbf{F}_1$ and $\mathbf{F}_2$ 
acting simultaneously should have the same effect as any other single force $\mathbf{F}$: while it’s 
true that the combined forces must end up producing an acceleration of some sort on 
each object, that acceleration might not be proportional to the mass of the object, even 
though the accelerations produced by $\mathbf{F}_1$ and $\mathbf{F}_2$ individually are. 




It should also be noted that the parallelogram law is not only used to de- 
termine the combined effect of two forces, but just as frequently it is used to 
decompose a given force into two hypothetical ones. For example, the usual 
elementary way to analyze the motion of a block sliding down a stationary inclined 
plane is to decompose the force $\mathbf{F}$ of gravity on the block into a force $\mathbf{F}_2$ 
parallel to the inclined plane, and a force $\mathbf{F}_1$ perpendicular to the inclined plane. 
Letting $\alpha$ be the angle from the horizontal to the incline of the plane, we see 
that the magnitude $|\mathbf{F}_2|$ of $\mathbf{F}_2$ is $\sin\alpha \cdot |\mathbf{F}|$. 





With this decomposition, the force of the block on the inclined plane is $\mathbf{F}_1$, so 
the inclined plane must exert a force of $-\mathbf{F}_1$ on the block, by the third law; the 
net effect is that the total force on the block is simply $\mathbf{F}_2$, from which we can 
determine the motion of the block explicitly: it undergoes uniform acceleration 
that is $\sin\alpha$ times its acceleration in free fall. 
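The decomposition is easy to carry out in coordinates. The sketch below is an illustration under stated assumptions: the incline is taken to descend to the right, the value g = 9.8 m/s² is an assumed approximate figure, and the function name is hypothetical.

```python
import math


def decompose_gravity(mass, alpha, g=9.8):
    """Split the weight F = (0, -m g) into F1 (normal to the incline)
    and F2 (along the incline).

    alpha is the angle of the incline above the horizontal, in radians.
    g = 9.8 m/s^2 is an assumed approximate value.
    """
    F = (0.0, -mass * g)
    # Unit vector pointing down the slope, and unit normal pointing into the plane.
    down_slope = (math.cos(alpha), -math.sin(alpha))
    into_plane = (-math.sin(alpha), -math.cos(alpha))
    f2 = F[0] * down_slope[0] + F[1] * down_slope[1]   # = m g sin(alpha)
    f1 = F[0] * into_plane[0] + F[1] * into_plane[1]   # = m g cos(alpha)
    F2 = (f2 * down_slope[0], f2 * down_slope[1])
    F1 = (f1 * into_plane[0], f1 * into_plane[1])
    return F1, F2


if __name__ == "__main__":
    F1, F2 = decompose_gravity(mass=2.0, alpha=math.radians(30))
    print("F1 =", F1, "F2 =", F2)
    print("acceleration along incline:", math.hypot(*F2) / 2.0)
```

The two pieces add up to the original force, and dividing the magnitude of the parallel piece by the mass gives g times the sine of the angle, independent of the mass.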

Of course, this all depends on the sleight-of-hand introduction of a force 
canceling $\mathbf{F}_1$, with the implicit assumption that the force $\mathbf{F}_2$ has no effect on the 
inclined plane. We will discuss the justification for such machinations in a little 
greater detail in Chapter 6, but for now we simply note that they are com- 
monplace in elementary mechanics courses, and Newton had no aversion to 
them. 

As a matter of fact, right after his proof of the parallelogram law of forces, 
which is Corollary 1 of the three laws, Newton states Corollary 2: 


And hence the composition of a direct force AD out of any oblique forces AB 
and BD is evident, and conversely the resolution of any direct force AD into 
any oblique forces AB and BD. And this kind of composition and resolution 
is indeed abundantly confirmed from mechanics. 


To illustrate, Newton then launches a long investigation of a mechanical 
situation. A replica of the rather complex diagram that appears in the Principia 
is shown on the opposite page, but we won’t reproduce his discussion. We 
merely mention that along the way he proves the law of the lever, and if you 
believe that his proof can be valid, then you can go read it yourself (Newton [2; 
pp. 418-420] as well as Chandrasekhar [2; pp. 24-25]). 







Newton, in a couple of pages, has offered a Student Guide to solving me- 
chanics problems: “the whole of mechanics—demonstrated in different ways 
by those who have written on the subject—depends on what has just now been 
said.” But Newton doesn’t intend to add much more, explaining near the end 
of the Scholium, “But my purpose here is not to write a treatise on mechanics.” 

Our purpose, on the other hand, is to write a treatise on mechanics, so we 
will soon be parting company with Newton. Before doing that, however, in 
the next chapter we will discuss Book 1, “The Motion of Bodies”, which comes 
immediately after the “Definitions” and “Axioms, or the Laws of Motion”, to 
see one of the reasons why Newton did write the Principia. 




ADDENDUM 1A 


IT ISN’T ROCKET SCIENCE 
(Why Easy Physics is So Hard: I) 


School children are customarily told that rockets work because of the third 
law, instead of being provided with the more prosaic explanation that the burn- 
ing fuel explodes both toward the front of the rocket and toward the rear, with 
the fuel that explodes toward the front pushing the rocket forward, while the 
fuel that explodes toward the rear simply escapes out the open end of the rocket. 

As a matter of fact, the action of a rocket doesn’t follow so clearly from 
the third law itself as from its consequence, the conservation of momentum: 
since the burning fuel has momentum in one direction, the remaining fuel and 
rocket must have momentum in the other direction in order to conserve the total 
momentum—namely 0 if the rocket starts at rest. As this example illustrates, 
one of the main attractions of “conservation” laws is that they often allow us to 
consider quite complicated situations involving myriad particles at once. 

Of course, a few idealizations are required here: rocket fuel is usually a liquid, 
but it is not unreasonable to regard it as a bunch of particles (which, at the 
atomic level, it really is); and the (empty) rocket itself is hardly a particle, and 
we might demand a more careful analysis of how the force exerted on one part 
gets transmitted to other parts, but we will defer that to a later chapter. 

Now it would seem fairly straightforward to get an analytic expression for the 
motion of a rocket. We'll first consider a rocket in empty space, so that there 
is no external force acting on it. Let v be its velocity, and let m(t) be the mass 
at time t, by which we mean the constant mass of the empty rocket plus the 
mass of the fuel still in the rocket at time t. Equivalently, $-m' = \mu$ is the rate at 
which the fuel is burned; this depends ultimately on the chemical characteristics 
of the fuel, the design of the combustion mechanism, etc., etc. We also need to 
consider the velocity q with which the fuel is ejected from the rocket; we'll use q 
for the velocity with respect to the rocket, so that $\mathbf{q} + \mathbf{v}$ is the velocity with respect 
to our inertial system. If we ignored q, then we could just as well assume that 
the fuel was simply being dumped off the rocket (q = 0), which wouldn’t result 
in any motion at all! 


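[Figure: the rocket moving with velocity $\mathbf{v}(t)$ while fuel is ejected with velocity $\mathbf{q}(t)$ relative to the rocket.] 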

The simplest analysis proceeds from the prosaic point of view suggested at 
the beginning. In a short time interval [t,t + h], the amount of fuel ejected is 




$m(t) - m(t+h)$, and therefore the momentum of expelled fuel will be close to 
$[m(t) - m(t+h)] \cdot \mathbf{q}(t)$. Thus, the momentum of the fuel in the other direction, 
pushing the rocket forward, will be the negative of this. So the force on the 
rocket must be the derivative, $m'(t) \cdot \mathbf{q}(t)$, and by the second law this means 
that 

$$(*)\qquad m'(t)\,\mathbf{q}(t) = m(t)\,\mathbf{v}'(t).$$
This can also be written as 

$$\mathbf{v}'(t) = \frac{m'(t)}{m(t)}\,\mathbf{q}(t) = \Big(\frac{d}{dt}\log m(t)\Big)\,\mathbf{q}(t).$$

In our example, $\mathbf{v}$ and $\mathbf{q}$ always point along the same straight line, and if we 
let $v$ and $q$ denote their lengths (the speed of the rocket and the speed of fuel 
ejection) then, remembering that $\mathbf{q}$ and $\mathbf{v}$ point in opposite directions, we can 
simply write 

$$\text{(R)}\qquad v'(t) = -q(t)\,\frac{m'(t)}{m(t)} = -q(t)\,\frac{d}{dt}\log(m(t)).$$

This argument might seem suspect, since we appear to be working in a 
coordinate system based on the rocket, which is not an inertial system, but that 
isn’t really the case. Although we derived (*) from the “rocket’s point of view”, 
at each particular time $t_0$ we were essentially working in the inertial system that 
is moving with the same velocity as the rocket at time $t_0$, and since the derived 
equation (*) involves only the change $\mathbf{v}'(t_0)$ of velocity at time $t_0$, it holds just as 
well in the inertial system where we are making our measurements of position. 
Nevertheless, most physics books avoid tackling an explanation of this sort and 
instead present the following analysis. 

Since the ejection velocity of the fuel with respect to our inertial system is 
$\mathbf{q} + \mathbf{v}$, in a small time interval $[t, t+h]$, the amount of fuel ejected, $m(t) - m(t+h)$, 
has a velocity close to $\mathbf{v}(t) + \mathbf{q}(t)$, so the total momentum of this expelled fuel 
is close to 

$$[m(t) - m(t+h)] \cdot (\mathbf{v}(t) + \mathbf{q}(t)).$$


The derivative at time $t$ of the momentum of the expelled fuel is 

$$\lim_{h\to 0}\frac{m(t) - m(t+h)}{h} \cdot (\mathbf{v}(t) + \mathbf{q}(t)) = -m'(t) \cdot (\mathbf{v}(t) + \mathbf{q}(t)).$$

Setting this equal to the derivative of $-m(t)\mathbf{v}(t)$, we get 

$$\frac{d}{dt}[m(t)\mathbf{v}(t)] = m'(t) \cdot (\mathbf{v}(t) + \mathbf{q}(t)),$$

which can also be written as our original equation 

$$(*)\qquad m'(t)\,\mathbf{q}(t) = m(t)\,\mathbf{v}'(t).$$


The funny thing about this problem is that we tend to think of it as a “real-life” 
problem, involving a continuously changing fuel mass, and then find ourselves 
in the position of having to use laws that apply only to individual particles. 
But if we made our “real-life” problem really real, by considering the fuel as a 
collection of particles being ejected individually in tiny increments of time, then 
we would view our rocket as receiving tiny changes of velocity at these times, 
but moving with constant velocity in the intervals between. In other words, our 
rocket is an inertial system on these intervals, which makes the validity of the 
first argument much more transparent. 
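This “really real” version of the problem is simple to simulate. The sketch below is a hedged illustration for motion along a line: the function names, the choice of equal pellets, and the numerical values are assumptions, and the ejection speed $q$ is taken to be constant, in which case integrating (R) from rest gives the standard result $v = q\log(m_{\text{initial}}/m_{\text{final}})$. Each pellet is taken to leave with speed $q$ backward relative to the rocket's velocity just before ejection, which is one of several possible conventions.

```python
import math


def discrete_rocket(empty_mass, fuel_mass, q_speed, num_pellets):
    """Eject the fuel as num_pellets equal pellets, one at a time.

    Each pellet leaves with speed q_speed backward relative to the rocket's
    velocity just before ejection. Between ejections the rocket moves
    uniformly. Returns the final speed, starting from rest.
    """
    pellet = fuel_mass / num_pellets
    mass = empty_mass + fuel_mass
    v = 0.0
    for _ in range(num_pellets):
        # Momentum conservation in the inertial frame:
        # mass * v = (mass - pellet) * v_new + pellet * (v - q_speed)
        v_new = (mass * v - pellet * (v - q_speed)) / (mass - pellet)
        mass -= pellet
        v = v_new
    return v


def continuous_rocket(empty_mass, fuel_mass, q_speed):
    """Integral of (R) with constant q, from rest: v = q * log(m_initial / m_final)."""
    return q_speed * math.log((empty_mass + fuel_mass) / empty_mass)


if __name__ == "__main__":
    exact = continuous_rocket(1.0, 9.0, 2.0)
    for n in (10, 1_000, 100_000):
        print(n, discrete_rocket(1.0, 9.0, 2.0, n), exact)
```

As the pellets become smaller and more numerous, the final speed of the discrete rocket approaches the value given by the continuous equation.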


A completely independent source of confusion is offered by some less recent 
mechanics texts, which like to point out that in special relativity, which we will 
barely mention in this volume, the mass m of even an individual particle is not 
constant, but depends upon its velocity; it then turns out that the second law, 
which we have always stated as 


$$\mathbf{F} = (m\mathbf{v})' = m\mathbf{v}'$$

has to be corrected to read 

$$\mathbf{F} = (m\mathbf{v})' = m'\mathbf{v} + m\mathbf{v}'.$$


Of course, the mass of a particle doesn’t vary with velocity in classical mechan- 
ics, but when we bring an external force F into the picture, we might wonder 
whether this more general equation is the right one to use for a rocket, with 
variable mass m(t). If F is an external force on the rocket, e.g., gravity, should 


we use 

$$\text{(1)}\qquad \mathbf{F}(t) = m(t)\,\mathbf{v}'(t)$$

or 

$$\text{(2)}\qquad \mathbf{F}(t) = (m\mathbf{v})'(t)\,?$$


The short answer is: neither. When F = 0, equation (1) contradicts (*) unless 
q = 0; this is hardly surprising, since F is simply the “external” force on the 
rocket, and we still have to account for the force exerted by the escaping fuel. 
And (2) likewise contradicts (*) when F = 0 except in the special case where 
$\mathbf{q} = -\mathbf{v}$. 




In fact, using the same reasoning by which we established (*) we can con- 
clude more generally that when an external force F acts on the rocket we have 


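$$(**)\qquad \mathbf{F}(t) = m(t)\,\mathbf{v}'(t) - m'(t)\,\mathbf{q}(t)$$

or, equivalently, 

$$(**')\qquad \mathbf{F}(t) = (m\mathbf{v})'(t) - m'(t)\,[\mathbf{v}(t) + \mathbf{q}(t)].$$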





A more complete answer is that the question is completely misleading. It pur- 
ports to be studying an object, “the rocket”, that has a variable mass. But the 
objects that we really have are the empty rocket, together with a myriad of par- 
ticles of rocket fuel, some of which are moving along with the empty rocket, 
while others are moving in the opposite direction. 

At any time $t$ the force of the burning fuel acts on the empty rocket together 
with the part of the fuel still moving along with it. By additivity of mass, this 
composite object is given the same acceleration as a single object whose mass 
is what we have called m(t), and which we misleadingly called the “mass of the 
rocket at time $t$”. 

In any case, the formula (**) or (**’) states the proper result and this analysis 
applies just as well to any “variable mass” problem, where a body’s mass is 
changing as particles are continually dispersed, or are added on, with velocity q 
(or velocity v + q with respect to an inertial system). 

Unfortunately, some textbooks like to claim that the proper law is, in fact, (2), 
perhaps under the sway of the influential text Mechanics by Sommerfeld [2], 
where Newton’s statement of the second law is specifically identified as (2), with 
the remark that “Newton’s formulation ... prophetically turns out to be the 
correct one.” 

Naturally, considerable backtracking is required when (2) has been perpe- 
trated as the proper relationship, as shown, for example, by the thorough, and 
thoroughly confusing, discussion in §I.4 of Sommerfeld’s book. 

To make matters worse, for one class of variable mass problems (**') reduces 
to (2), namely, those in which $\mathbf{v} + \mathbf{q} = 0$. For example, consider a satellite 
moving in empty space uniformly filled with stationary interplanetary debris 
that sticks to the satellite when it hits it, thereby increasing the satellite’s mass 
$m(t)$. In this case, (2) is applicable because it amounts to (**') with $\mathbf{v} + \mathbf{q} = 0$, 
which merely says that the accumulated debris is initially at rest before it starts 
moving along with the satellite. (Since our analysis was made in terms of mass 
removed from a body, it might help to think of the reverse time picture, where 
bits of the satellite are being expelled with exactly the velocity that they need to 
become part of the stationery interplanetary debris.) 




A similar situation, where again the proper equation (**’) reduces to (2), 
occurs for a raindrop falling through an atmosphere saturated with water vapor, 
and accumulating mass by condensation (Problem 12). 

A typical contrasting case is illustrated by a fine chain with very small links 
lying on a table, with a small piece hanging over the edge, initially held at rest 
and then released (Problem 13). Now the added particles, the small links, are 
already moving with the velocity of the hanging piece, so we have q(t) = 0, 
and equation (**) instead reduces to (1). 

Further consideration of these examples will be found in the Problems, and 
in Addendum 3A of Chapter 3 (page 100). 




ADDENDUM 1B 
WEIGHT VERSUS MASS 


Newton had good reason for wanting to check the proportionality of mass to 
weight with a high degree of accuracy, since it is crucial to his “universal law of 
gravity”, that the force F between two bodies of masses $m_1$ and $m_2$, separated 
by a distance d, has magnitude 

    $$|F| = G\,\frac{m_1 m_2}{d^2},$$

where G is some “universal constant”.¹ 

In fact, at the very beginning of the Principia, Newton states, right after 
the definition of mass, “It can always be known from a body’s weight, for—by 
making very accurate experiments with pendulums—I have found it to be 
proportional to the weight, as will be shown below.” The experiments in question 
are completely different from those mentioned at the bottom of page 22, and 
Newton’s reference to these other experiments must have caused many a reader 
to scurry vainly through the succeeding pages of the Principia, because the pre- 
sentation of these particular experiments actually occurs several hundred pages 
later in the Principia, in Proposition 6 of Book 3! 

To discuss Newton’s experiments, we will use the result of Problem 17(c), 
showing that the periods $T_1$ and $T_2$ of a pendulum bob undergoing different 
accelerations $g_1$ and $g_2$ stand in the relation 

    $$\frac{g_2}{g_1} = \frac{T_1^2}{T_2^2}.$$

We are trying to compare the accelerations $g_1$ and $g_2$ on two objects, of 
masses $m_1$ and $m_2$. Denoting their weights by $W_1 = g_1 m_1$ and $W_2 = g_2 m_2$, the 
above equation can be written 

    (*)    $$\frac{m_1}{m_2} = \frac{W_1}{W_2}\cdot\frac{T_1^2}{T_2^2},$$

and in particular, if $W_1 = W_2$ we have 

    (**)   $$\frac{m_1}{m_2} = \frac{T_1^2}{T_2^2}.$$
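
As a rough numerical illustration of the relation m_1/m_2 = (W_1/W_2)(T_1^2/T_2^2) between masses, weights and pendulum periods, here is a minimal Python sketch. The function name `mass_ratio` and the sample numbers are hypothetical, chosen only for illustration; the sketch supposes two bobs of equal weight whose masses differ by one part in a thousand, and shows that their periods would then differ by about one part in two thousand.

```python
def mass_ratio(w1, w2, t1, t2):
    """Return m1/m2 from the relation m1/m2 = (W1/W2) * (T1^2 / T2^2)."""
    return (w1 / w2) * (t1 ** 2 / t2 ** 2)

# Hypothetical data: equal weights, but bob 1 has 0.1% more mass than bob 2.
# Since the period is inversely proportional to the square root of g,
# T1/T2 = sqrt(m1/m2) when the weights are equal.
t2 = 1.0                      # period of bob 2 (arbitrary time unit)
t1 = t2 * 1.001 ** 0.5        # period of bob 1

ratio = mass_ratio(1.0, 1.0, t1, t2)   # recovers the assumed mass ratio
period_excess = t1 / t2 - 1.0          # relative difference of the periods

print(ratio, period_excess)
```

So detecting a mass difference of a thousandth part amounts, under these assumptions, to detecting a period difference of roughly half that size, which accumulates over many swings.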


¹ More precisely, the factor $m_2$ involves the proportionality of a body’s mass m and of 
the force exerted on it by another body, while the fact that the masses of both bodies 
enter symmetrically, as the double factor $m_1 m_2$, is basically equivalent to the third law 
for gravitational forces; as we shall see in the next chapter, the factor $1/d^2$ is consistent 
with the elliptical orbits of planets, and the fact that G is the same for all bodies is 
consistent with Kepler’s third law. 




The result (*) occurs as Proposition 24 of Book 2 of the Principia, Newton [2; 
pg. 700]; presumably Newton’s proof (with nary an equation, or even an 
algebraic symbol, in sight) is equivalent to that in Problem 17. 

Newton tested (**) with equal weights of “gold, silver, lead, glass, sand, com- 
mon salt, wood, water, and wheat”. Each pair of materials to be tested was 
enclosed within one of two rounded, equal-sized wooden boxes. For the wood 
bob he simply filled the inside of the box with more wood, but for the gold 
bob he suspended the gold at the center of the box; he then hung each of the 
two boxes by eleven-foot cords, which “made pendulums exactly like each other 
with respect to their weight, shape, and air resistance.” 

He then placed them close to each other, and started them swinging from 
the same height, noting that “they kept swinging back and forth together with 
equal oscillations for a very long time. ... And it was so for the rest of the 
materials. In these experiments, in bodies of the same weight, a difference of 
matter that would be even less than a thousandth part of the whole could have 
been clearly noticed.” Newton doesn’t give any further details, but it should be 
noted that Problem 17 (a) shows that an inaccuracy in measuring the lengths of 
the pendulums would produce errors of the same magnitude as those we are 
trying to eliminate. 

In 1832 Bessel made improved pendulum experiments that established 
proportionality within an error of 2 parts in $10^5$ (cf. Problem 6-3). But the most 
accurate experiments depend on an ingenious idea first introduced by Eötvös 
in 1889, which enables us to consider only static measurements. 

Although a suspended weight may seem to be hanging vertically in the 
laboratory (a), because of the earth’s rotation it is really traveling along a circle (b), 
and its natural tendency to travel in a straight line means that it acts as if there 
were a small “centrifugal force” pulling it away from the axis of rotation. So 
the weight is really hanging at a slight angle (c), whose direction is compounded 
from the downward force $F_1$ of gravity on the object, and the acceleration $F_2$ 
from its circular motion (centrifugal force is discussed in Chapter 10, where 
detailed computations of the angle can be found, on page 380). 

[Figure: (a) the weight as it seems to hang in the laboratory; (b) its circular path due to the earth’s rotation; (c) the slight angle at which it really hangs.] 




The angle of $F_2$ is determined by the latitude of the laboratory, and its length 
depends (aside from the rate of revolution of the earth, and the latitude and 
radius of the earth) only on the mass of the object. On the other hand, the 
force $F_1$ depends on the weight of the object. Consequently, if two objects have 
differing ratios of mass and weight, the angles at which they are hanging will 
be slightly different. So if they are suspended from opposite ends of a bar 
attached to a sensitive torsion balance, they ought to twist the torsion fibre in 
one direction.¹ 

Of course, this twist would be quite small, and various side effects might 
produce a twist that overwhelmed it. But Eötvös was able to look for the existence 
of a net twist by enclosing the whole apparatus within a chamber that could be 
rotated through 180°, which would reverse the twist due to differing ratios of 
mass and weight. As a result of his experiments, Eötvös concluded that weight 
and mass must be proportional within an error of 1 part in $10^7$. 

One crucial point about the Eötvös experiment is that the tested objects need 
not have exactly the same mass: so long as their masses are close, a difference 
in the ratios of force to mass would produce a net twist. Thus, as in Newton’s 
pendulum experiment, we can show proportionality of weight to mass to a high 
degree of accuracy without having to measure mass itself accurately at all. 

At the other extreme from the naive belief that heavier bodies fall faster than 
lighter ones, we have Einstein’s extremely sophisticated view regarding the 
proportionality of mass and weight, that the identical acceleration experienced by 
all bodies must indicate that this acceleration is not really due to a force at all. 
This led to the general theory of relativity, which interprets free fall trajectories 
as natural geodesics in a curved space-time. 


¹ The use of a torsion balance goes back to Henry Cavendish (1731-1810), who used it, in 
one of the first physics experiments performed to a high degree of accuracy, to measure 
the extremely small gravitational force between two lead spheres. He expressed the 
result in terms of the density of the earth, though his neighbors always described the 
building where this was done as the place where the world was weighed. 

By the way, as far as I know, no one has ever conducted an experiment of this nature 
to test Newton’s third law for gravitational forces. 




This foundational role for the proportionality of mass and weight has inspired 
even more refined experiments of the Eötvös type, starting with the 1964 paper 
of Roll, Krotkov, and Dicke [1], which verified proportionality within 1 part in 
$10^{11}$ for gold and aluminum. 

The book Gravitation, by Misner, Thorne, and Wheeler [1], gives a brief 
description of the experiment, and points out how much care is required to rule 
out extraneous influences. Many more details are provided by the paper itself, 
which undoubtedly can serve as an inspiration for any aspiring experimental 
physicist. 




PROBLEMS 


1. The purpose of this first problem is simply to remind us once again how 
slippery elementary physics problems can be. 

A man pushes against a cart with his hand, and the cart, initially at rest, starts 
to move. So his hand must be exerting a force on the cart. By the third law, the 
cart must be exerting a force on his hand in the opposite direction. How can 
that be, when his hand is obviously not accelerating in the opposite direction? 
(Compare the situation where the man is thrown against the cart with his hand 
outstretched.) 


[Answer, printed upside down in the original:] His hand has other forces on it 
transmitted through his body by the force that the floor exerts on his feet. This 
force is due to the fact that his feet exert force on the floor (that’s how he pushes), 
and the floor has friction (otherwise he would just slip backwards when he tried to 
push the cart). Altogether, we have an incredibly complicated system whose details 
we thankfully needn’t explore right now. 


2. (a) Consider two objects $B_1$ and $B_2$ of equal mass m. The object $B_2$ is 
stationary, while $B_1$ is moving towards $B_2$ with velocity v. Suppose that after 
the collision, $B_1$ and $B_2$ have velocities $w_1$ and $w_2$ that are collinear with v and, 
moreover, point in the same direction as v. Show that $|w_1| \le |v|/2$. (This is a 
very simple physics problem, not a mathematics problem!) 

(b) Suppose that a system of particles $c_1, \ldots, c_K$ with masses $m_1, \ldots, m_K$ 
satisfies conservation of momentum, i.e., $\sum_i m_i v_i' = 0$. Show that there are forces 
$F_{ij} = -F_{ji}$ such that $m_i v_i' = \sum_j F_{ij}$. (This is a simple mathematics problem, 
not a physics problem.) 

(c) Can we always choose the $F_{ij}$ so that they satisfy the “strong form” of the 
third law (page 25)? 


3. Nowadays, rather than working from impulsive forces to continuous ones, 
we are more apt to reverse the process. Suppose that on the time interval [0, h] 
we have a force acting along the x-axis with magnitude $f(t) \ge 0$ at time t, 
and f(0) = f(h) = 0. For a particle of mass m starting at rest, so that its 
distance x(t) from 0 satisfies 0 = x(0) = x’(0), show that $x'(h) = F = \frac{1}{m}\int_0^h f$. 
So we can basically think of an “impulsive” force at time 0 as one that suddenly 
changes our particle from one at rest to one with velocity F. 





(a) Here is the precisely stated part. Let $c_1$ and $c_2$ be two particles, of masses $m_1$ 
and $m_2$, let $F_{12} = -F_{21}$ be the “internal” forces between them, and let F be 
an external force on $c_1$ that always points in the direction from $c_1$ to $c_2$, so that 
under these combined forces the particles move along a straight line. 

[Figure: the particles $c_1$ and $c_2$ on a line.] 

Now suppose that $c_1$ and $c_2$ always remain the same distance apart. Find $F_{12}$ 
in terms of F. (In particular, if F is a non-constant force, then $F_{12}$ must also 
vary with time, even though the distance between $c_1$ and $c_2$ remains the same. 
Since we usually think of the internal forces as functions of the distance between 
the particles, rather weird internal forces would thus be necessary for truly rigid 
bodies, which are the subject matter of Chapter 5.) 

(b) Here is the version of this problem as it appears in elementary physics books. 
Two blocks $B_1$ and $B_2$, of masses $m_1$ and $m_2$, are in contact on a frictionless 
table. A horizontal force F is applied to $B_1$. Find the forces between $B_1$ and $B_2$ 
in terms of F. 

In this form, the problem appears to assume tacitly that the blocks remain 
in contact. Suppose the force is delivered as the result of another block B of 
mass M colliding with $B_1$. The fact that the system of all three blocks satisfies 
conservation of momentum does not limit us very much: show that $B_1$ and $B_2$ 
can move in such a way that they remain in contact, and also in such a way 
that they do not. If we consider F to be an “instantaneous” force applied at 
some time $t_0$, can we find $F_{12}(t_0)$? 





5. (a) For circular motion with constant angular frequency, 

    $$c(t) = R(\cos\omega t, \sin\omega t),$$

calculate that the magnitude of the acceleration can be written as $v^2/R$, where v 
is its constant speed. 

(b) More generally [and more easily proved]: for any motion on a sphere of 
radius R, the inward component of the acceleration at time t has magnitude 

    $$v(t)^2/R.$$

Note, by the way, that part (a) can easily be proved geometrically: 

[Figure: the vectors $v(t) - v(t_0)$ and $c(t) - c(t_0)$, and the radius R.] 

    $$\frac{|v(t) - v(t_0)|}{v} = \frac{|c(t) - c(t_0)|}{R}$$
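
The claim in part (a), that uniform circular motion has acceleration of magnitude v^2/R, can be checked numerically. The following Python sketch is only a sanity check, not a proof: it differentiates c(t) = R(cos ωt, sin ωt) twice by central differences. The names `position` and `acceleration_magnitude`, the step size `h`, and the sample values of R and ω are all hypothetical choices.

```python
import math

def position(t, radius, omega):
    """Point on the circle c(t) = R (cos(omega t), sin(omega t))."""
    return (radius * math.cos(omega * t), radius * math.sin(omega * t))

def acceleration_magnitude(t, radius, omega, h=1e-4):
    """Estimate |c''(t)| with a central second difference of step h."""
    x0, y0 = position(t - h, radius, omega)
    x1, y1 = position(t, radius, omega)
    x2, y2 = position(t + h, radius, omega)
    ax = (x0 - 2.0 * x1 + x2) / h ** 2
    ay = (y0 - 2.0 * y1 + y2) / h ** 2
    return math.hypot(ax, ay)

R, omega = 2.0, 3.0            # hypothetical radius and angular frequency
speed = R * omega              # constant speed v of the motion
a_numeric = acceleration_magnitude(0.7, R, omega)
a_formula = speed ** 2 / R     # the value v^2 / R claimed in part (a)

print(a_numeric, a_formula)
```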







6. Suppose we have an accurately calibrated spring scale and we weigh an 
object in an elevator that has an acceleration a. What will the scale read? 
Distinguish between the cases of an upward acceleration and a downward ac- 
celeration. 


7. Here is a problem that might appeal to bright physics students, especially 
those interested in experimental physics; I wouldn’t even know where to begin. 


(a) Dr. Fignewton publishes a paper in the Journal of Irreproducible Results in 
which he claims to have performed experiments suggesting that the second law 
should actually read 


    $$F = m(v' + \alpha v'')$$

for a very small “universal constant” $\alpha \approx 0.000327\ldots$. Devise an experiment 
of sufficient accuracy to refute this. 


It would appear that one must investigate motion under some non-uniform 
force F, but it is important to make sure that the argument doesn’t beg the 
question, by implicitly measuring F through an appeal to the second law. 


(b) Devise an experiment to verify the parallelogram law of forces within 1 part 
in $10^{11}$ (all right, I’m actually willing to settle for 1 part in 10°). 


8. In the theory of special relativity, the mass m of an object moving with speed v 
is related to its “rest mass” $m_0$ by the equation 

    (a)    $$m = m_0 (1 - v^2/c^2)^{-1/2},$$

and the second law is modified to 

    (b)    $$F = (mv)'$$

(so that the argument on page 22 can still be used to show that momentum is 
conserved). Use (a) and (b) to derive the formula on page 17. 


For the following problems, we adopt the usual notation that g denotes the ac- 
celeration produced by gravity on a body near the earth’s surface. Equivalently, 
gm is the magnitude of the force on the body of mass m. 


9. Assume that the earth is exactly spherical, and that a particle c, under the 
influence of the earth’s gravity, circles the earth just above the surface of the 
earth—also ignore air resistance, the moon and the sun, and so forth. Show 
that the speed of the particle is $\sqrt{g R_e}$, where $R_e$ is the radius of the earth. 




10. (a) Consider a rocket in empty space, initially at rest, and assume that q 
in equation (R) on page 33 is constant. Let the mass of the fuel be $r \cdot m(0)$ 
(0 < r < 1). Show that the final velocity achieved by burning all the fuel is 
$q \log 1/(1 - r)$. Notice that this is independent of the particular manner in 
which the fuel is burned (i.e., independent of m’). 

(b) Suppose our rocket is initially at rest on the earth, pointing upwards, and 
for simplicity suppose that the downward acceleration due to gravity is the 
constant g. Show that if the final velocity is achieved at time $t_0$, then this final 
velocity is $[q \log 1/(1 - r)] - g t_0$ (so the faster we burn the fuel the better). 


The next problem is a companion to this one. 


11. (a) A group of people, each with the same mass, stand at one end of a cart 
that can roll without friction, and jump off with the same speed. Show that if 
everyone jumps off at once, then the resulting speed of the cart is greater than 
the final speed the cart acquires if they jump off one at a time. (“Jumping off” 
should be regarded as involving an instantaneous force). 

(b) Analogously, consider the rocket of the previous problem, initially at rest 
in empty space, and suppose that all the fuel is ejected “instantaneously” with 
speed q. Show that the rocket achieves the final velocity $qr/(1 - r)$, which is 
greater than the velocity $q \log 1/(1 - r)$ found in part (a) of that problem. How 
can these results be reconciled? 
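
One way to explore numerically how the two final velocities in Problems 10 and 11 are related is to eject the fuel in n equal chunks and let n grow. The Python sketch below rests on an assumption that is not spelled out in the problems: each chunk leaves with speed q relative to the velocity the rocket had just before that ejection (this choice makes n = 1 reproduce qr/(1 - r)). The function name `final_velocity` and the sample values of q and r are hypothetical.

```python
import math

def final_velocity(q, r, n_chunks, m0=1.0):
    """Final speed of a rocket of initial mass m0, starting at rest, after the
    fuel mass r*m0 is thrown off in n_chunks equal pieces.

    Assumption: each piece leaves with speed q relative to the rocket's
    velocity just before that piece is ejected.
    """
    dm = r * m0 / n_chunks
    mass, v = m0, 0.0
    for _ in range(n_chunks):
        # Conservation of momentum for one ejection:
        #   mass * v = (mass - dm) * v_new + dm * (v - q)
        v += q * dm / (mass - dm)
        mass -= dm
    return v

q, r = 1.0, 0.5                                # hypothetical values
all_at_once = final_velocity(q, r, 1)          # equals q r / (1 - r)
many_chunks = final_velocity(q, r, 100_000)    # close to q log 1/(1 - r)

print(all_at_once, many_chunks, q * math.log(1.0 / (1.0 - r)))
```

Under this assumption the final speed decreases as the number of chunks grows, in the same way as for the people jumping off the cart in part (a) of Problem 11.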


12. Consider a falling raindrop that gains mass from the surrounding atmo- 
sphere saturated with water vapor, so that equation (2) on page 34 holds. 


(a) If the rate at which it gains mass is proportional to its surface area, say 
$m' = \alpha \cdot (\text{surface area})$, then its radius r(t) satisfies $r(t) = r_0 + \alpha t$, where $r_0$ is 
its radius at time 0, and its speed v(t) satisfies 

    $$v(t) = \frac{g}{4}\left(t + \frac{r_0}{\alpha}\right) + \left(v_0 - \frac{g}{4}\,\frac{r_0}{\alpha}\right)\left(1 + \frac{\alpha t}{r_0}\right)^{-3},$$

where $v_0$ is its speed at time 0. 

(b) Suppose instead that it gains mass at a rate proportional to the product of its 
surface area and speed v. Show that its acceleration asymptotically approaches 
a value. 

(c) Suppose that it gains mass at a rate proportional to the product of its mass 
and its speed v. Show that the speed v asymptotically approaches a value. 
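
The closed formula for v(t) in part (a) can be compared with a direct numerical integration. The Python sketch below assumes, as in part (a), that the radius grows as r(t) = r_0 + αt and that the mass is proportional to r^3, so that (mv)' = mg becomes v' = g - 3αv/(r_0 + αt). The function names and the sample values are hypothetical, and the Runge-Kutta integrator is merely one convenient choice.

```python
def closed_form_speed(t, g, alpha, r0, v0):
    """The formula of part (a) for the speed of the raindrop."""
    return (g / 4.0) * (t + r0 / alpha) + \
        (v0 - g * r0 / (4.0 * alpha)) * (1.0 + alpha * t / r0) ** -3

def integrate_speed(t_end, g, alpha, r0, v0, steps=5000):
    """Integrate v' = g - 3 alpha v / (r0 + alpha t) by classical RK4."""
    def dv_dt(t, v):
        return g - 3.0 * alpha * v / (r0 + alpha * t)

    dt = t_end / steps
    t, v = 0.0, v0
    for _ in range(steps):
        k1 = dv_dt(t, v)
        k2 = dv_dt(t + 0.5 * dt, v + 0.5 * dt * k1)
        k3 = dv_dt(t + 0.5 * dt, v + 0.5 * dt * k2)
        k4 = dv_dt(t + dt, v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return v

# Hypothetical sample values (any consistent units).
g, alpha, r0, v0 = 9.8, 0.5, 1.0, 2.0
print(integrate_speed(3.0, g, alpha, r0, v0),
      closed_form_speed(3.0, g, alpha, r0, v0))
```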






13. Let the chain on page 36 have length l, mass M, and uniform density $\mu = M/l$, 
and let $l_0$ be the amount initially hanging down. Find the length x(t) 
hanging down after time t, and the time at which the whole chain has fallen off 
the table. (Note that the whole chain is always being pulled, and recall that the 
equation $f'' = \alpha f$ has the solutions $f(t) = e^{\pm\sqrt{\alpha}\,t}$.) (Compare Problem 3-23.) 


14. Physics textbooks like to consider a railroad cart rolling without friction, 
while its mass increases at a steady rate p because of sand from a hopper falling 
into it. This variable mass problem can be considered to have v + q = 0, 
since we really only want to consider the horizontal components of forces and 
velocities. 

(a) Suppose that the empty cart, of mass M, is at rest at time 0, and a constant 
horizontal force of magnitude F is applied as the sand starts to fall. Find the 
speed v(t) at time t, and the limiting velocity as $t \to \infty$. 

(b) Now consider the opposite case where the cart has an opening in the bottom 
that allows sand to pour out, decreasing the mass at the steady rate p. At time 0 
the cart is at rest, and the constant horizontal force of magnitude F is applied 
to it. If the sand has mass m at time 0, find the velocity of the cart when all the 
sand has poured out. 

(c) Finally, suppose that mass is increasing at the constant rate $p_1$ from sand 
pouring in the top, and decreasing at the constant rate $p_2$ from sand pouring 
out the bottom. What is the proper equation of motion? 




15. (a) Suppose that the railroad cart of the previous problem has been set 
moving with a constant speed v. Sand starts pouring into it in some arbitrary 
way, with m(t) being the total amount poured in by time t. What force F(t) is 
needed to insure that the cart will continue to have constant speed v? (Compare 
Problem 3-22.) 


16. A fine chain of length l and mass M, with uniform density $\mu = M/l$, is 
held at one end, with the free end just touching the floor, and then released at 
time t = 0. The problem is to compute the force that the chain exerts on the 
floor, as a function of t, or equivalently, as a function of the length $x = \frac{1}{2} g t^2$ 
of chain that has fallen. 


(a) A straightforward answer can be obtained in the same way as our rocket 
equation was derived. The fallen part itself contributes a force equal to its 
weight $\frac{1}{2}\mu g^2 t^2$, and we need to determine the force contributed by the falling 
chain, the end of which has speed gt. The amount of momentum contributed 
by the falling chain in a small time interval h is close to $(\mu g t \cdot h) \cdot g t$. Conclude 
that the force of the chain on the floor at time t is 

    $$\tfrac{3}{2}\mu g^2 t^2 = 3\mu g x.$$

When the chain has just finished falling, this has the value 3gM = 3W, where W 
is the weight of the chain, although one would assume that it should simply be 
the weight W from that time on; we’ll worry about this discrepancy in part (c). 


(b) A nifty way to solve the problem directly from our general equation (**) 
on page 35 is to consider the force F that the floor exerts back on the part of 
the chain lying on the ground, which we will consider as an object of variable 
mass $m(t) = \frac{1}{2}\mu g t^2$, which happens to have velocity 0 when acted upon by 
gravity and F. We have $m'(t) = \mu g t$ and gt is also the magnitude of q(t) in 
this situation. Use equation (**) to show that F(t) has magnitude $3\mu g x$. 

(c) This problem is usually stated in terms of the reading of a scale onto which 
a chain is falling (a), assuming, though not explicitly (unless the textbook writers 
have a conscience), that the mechanism of the scale is so tight that the pan 
stays at nearly the same height throughout. Of course, the pan actually dips 
a bit, and after the chain has finished falling the pan will suddenly pop back 
up, with the scale reading of 3W quickly correcting itself to W. Even when the 
chain is dropped directly on the floor, the floor will be pushed down a tiny bit, 
and eventually pop back up a bit at the end of the fall. To get a more realistic 
problem, we need to consider a less realistic picture (b) where the weighing 
mechanism is explicitly based on the compression of a spring. Using Hooke’s 
Law, which says that when one end of the spring is compressed by a distance s, 
the spring exerts a force in the other direction of magnitude $k \cdot s$ for some 
constant k, find an equation for s in terms of t and this “spring constant” k. 

By the way, with most scales, one can observe a scale reading greater than W, 
but it’s practically impossible to observe a reading anywhere close to 3W, since 
the chain falls so rapidly and the scale usually responds so slowly. 
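
The value 3μgx for the force on the floor in parts (a) and (b) can be made plausible with a crude discrete model. The Python sketch below assumes that the chain consists of many tiny links in free fall, each of which stops dead on hitting the floor, and it averages the delivered momentum over a short time window. The function name `average_floor_force`, the window length and all numerical values are hypothetical choices, not part of the problem.

```python
import math

def average_floor_force(t, dt, length=1.0, mass=1.0, g=9.8, n_links=100_000):
    """Force on the floor near time t for a chain of n_links small links.

    Assumptions: every link falls freely from its initial height and is
    brought to rest instantly on landing. The force is the weight of the
    links already landed by time t, plus the momentum delivered during the
    window (t - dt/2, t + dt/2] divided by dt.
    """
    link_mass = mass / n_links
    landed_mass = 0.0
    impulse = 0.0
    for i in range(n_links):
        height = (i + 0.5) * length / n_links      # initial height of link i
        t_land = math.sqrt(2.0 * height / g)       # time at which it lands
        if t_land <= t:
            landed_mass += link_mass
        if t - dt / 2.0 < t_land <= t + dt / 2.0:
            impulse += link_mass * g * t_land      # impact speed is g * t_land
    return landed_mass * g + impulse / dt

t = 0.3                                   # hypothetical time, before the fall ends
x = 0.5 * 9.8 * t ** 2                    # length of chain fallen by time t
predicted = 3.0 * (1.0 / 1.0) * 9.8 * x   # 3 * mu * g * x with mu = M / l = 1
measured = average_floor_force(t, dt=0.01)

print(measured, predicted)
```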




17. The usual analysis of a pendulum involves decomposing the gravitational 
force F of magnitude gm on the bob into a force $F_1$ in the direction of the 
pendulum string and another force $F_3$ tangent to the path of the bob. The 
string is also exerting a force, $F_2$, on the bob, which is assumed to point along 
the direction of the string. We must have $F_2 = -F_1$, since we assume that 
the bob stays at a constant distance from the pivot point, keeping the string 
taut but not stretching it out. (This reasoning is all examined more carefully in 
Chapter 6.) Thus, the net force on the bob is $F + (-F_1) = F_3$, and consequently 
the acceleration of the bob, tangent to the circular path, has magnitude 

    (1)    $$a_3 = g\sin\theta.$$

If we consider $\theta$ as a function of time, and let l be the length of the string 
(the radius of the circle on which the pendulum bob moves), then equation (1) 
yields 

    (P)    $$\theta'' + \frac{g}{l}\sin\theta = 0.$$




Although this simple little equation can’t be solved explicitly, some important 
exact information can be derived using only trivial mathematical manipulations: 


(a) For convenience, choose the origin O to be the point from which the 
pendulum hangs. For any $\alpha > 0$ consider the path 

    $$\gamma(t) = \alpha\cdot c(t/\sqrt{\alpha}),$$

which follows a circle with radius $\alpha$ times the radius of the path c, but with the 
time reparameterized by the factor $1/\sqrt{\alpha}$. Then the angle $\bar\theta(t)$ for $\gamma$ satisfies 

    $$\bar\theta(t) = \theta(t/\sqrt{\alpha}).$$

Conclude that 

    $$\bar\theta'' + \frac{g}{\alpha l}\sin\bar\theta = 0,$$

so that $\gamma$ gives the path of a pendulum bob with length $\alpha$ times that of the original. 
(b) The period of a pendulum, for a given fixed initial angle $\theta_0$, is the time 
required for the pendulum to return to this position, after first swinging through 
to the angle $-\theta_0$. Conclude that the period of the pendulum described by $\gamma$ 
for the angle $\theta_0$ is $\sqrt{\alpha}$ times the period of the pendulum described by c for the 
same angle $\theta_0$. Briefly, the period of a pendulum is proportional to the square 
root of its length. 

(c) Instead of varying the length of the pendulum, let us instead assume that 
we have changed the acceleration g to $g\cdot\alpha$. (One way of doing this would be 
to move the pendulum to the moon, for example, though in Addendum 1B we 
instead consider the possibility that g depends on the material comprising the 
pendulum bob.) In this case, show that the motion of the pendulum is now 
described by the curve 

    $$\gamma(t) = c(\sqrt{\alpha}\,t),$$

and conclude that the period of a pendulum is inversely proportional to the 
square root of the acceleration g. 

(d) The speed of the pendulum bob at time t is $v = l\,|\theta'(t)|$. Use the pendulum 
equation to show that if h(t) is the height of the pendulum bob at time t, then 

    $$g\cdot h(t) + \tfrac{1}{2}v(t)^2 = \text{constant}$$

(compare page 93). 
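
The scaling laws of parts (b) and (c) can be checked numerically without solving the pendulum equation in closed form. The Python sketch below integrates θ'' + (g/l) sin θ = 0 with a classical Runge-Kutta step, starting at rest at the angle θ_0, and takes the period to be four times the time needed to reach θ = 0. The function name `period`, the step size and the sample values are hypothetical choices.

```python
import math

def period(length, g, theta0, dt=1e-4):
    """Period of a pendulum released at rest from angle theta0 (0 < theta0 < pi).

    Integrates theta'' = -(g/length) sin(theta) by RK4 until theta first
    reaches 0, which takes a quarter of the period.
    """
    def accel(theta):
        return -(g / length) * math.sin(theta)

    theta, rate, t = theta0, 0.0, 0.0
    while True:
        k1t, k1r = rate, accel(theta)
        k2t, k2r = rate + 0.5 * dt * k1r, accel(theta + 0.5 * dt * k1t)
        k3t, k3r = rate + 0.5 * dt * k2r, accel(theta + 0.5 * dt * k2t)
        k4t, k4r = rate + dt * k3r, accel(theta + dt * k3t)
        new_theta = theta + dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        new_rate = rate + dt * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        if new_theta <= 0.0:
            # Linear interpolation for the instant at which theta = 0.
            fraction = theta / (theta - new_theta)
            return 4.0 * (t + fraction * dt)
        theta, rate, t = new_theta, new_rate, t + dt

base = period(1.0, 9.8, 1.0)                  # hypothetical l, g, theta0
longer = period(4.0, 9.8, 1.0)                # length multiplied by 4
stronger = period(1.0, 4.0 * 9.8, 1.0)        # g multiplied by 4

print(longer / base, stronger / base)         # expected near 2 and 1/2
```

Note that the same initial angle is used in all three runs, as the statement of part (b) requires.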





18. Suppose that the suspension point of a pendulum is moving horizontally, 
with its distance from the position at time 0 being a(t). Show that we now have 

    $$\theta'' + \frac{g}{l}\sin\theta = -\frac{a''}{l}\cos\theta.$$

[Figure: the suspension point at a(0) and at a(t), and the angle $\theta$.] 

In particular, for an oscillating motion $a(t) = \alpha\cos kt$, we have 

    $$\theta'' + \frac{g}{l}\sin\theta = \frac{\alpha k^2}{l}\cos kt\,\cos\theta.$$

19. A pendulum bob hangs without swinging in a car accelerating with constant 
acceleration a up a hill at angle 0. 


6 


(a) The total force on the bob, not counting the force in the direction of the 
string, is 
G = Vg? +a? + 2ag cos 8, 


and the angle @ that the string makes with the vertical satisfies 
6 

re oe | 

with @ < 6. 


(b) Find the equation for the motion of the pendulum bob if it is set swinging. 





20. A pendulum bob hangs at a constant angle $\theta > 0$ from the vertical, and 
moves along a horizontal circle with constant angular frequency $\omega$. 

(a) Assuming that the only forces on the bob are the constant force of gravity 
downwards, and a force $F_2$ in the direction of the string, of length l, conclude 
that 

    $$\cos\theta = \frac{g}{l\omega^2}.$$

(b) This solution makes no sense for $\omega < \sqrt{g/l}$. Note that in all cases, one 
possible solution is $\theta = 0$, i.e., the bob hangs straight down. For $\omega > \sqrt{g/l}$, 
what happens when a bob moving in this way is perturbed slightly? 







21. When our pendulum swings through a very small angle, so that $\theta$ in 
equation (P) is always very small, we use the fact that $\lim_{\theta\to 0} (\sin\theta)/\theta = 1$ to simplify 
our equation to 

    (P’)   $$\theta'' + \frac{g}{l}\theta = 0,$$

obtaining an equation whose period we can easily determine [the accuracy 
with which we approximate the period is examined in Problem 23]. Setting 
$\omega = \sqrt{g/l}$, so that our equation becomes 

    $$\theta'' + \omega^2\theta = 0,$$

the solution is of the form $\theta(t) = \alpha\sin\omega t + \beta\cos\omega t$ for constants $\alpha$ and $\beta$. It 
is convenient to assume that the pendulum hangs straight down ($\theta = 0$) at time 
t = 0, so that we have 

    $$\theta(t) = \alpha\sin\omega t;$$

here $\alpha$ is clearly the maximum angle from the vertical that the pendulum 
reaches. 

(a) Conclude that the period T of the pendulum is given by 

    $$T = 2\pi\sqrt{l/g}.$$

Since we can measure T very accurately by counting the number of periods 
through which the pendulum swings in a long time interval, we ought to be 
able to use this equation to find g accurately. In practice, however, it is 
impossible to determine the length of the string holding the pendulum with sufficient 
accuracy, since the string stretches a bit when the pendulum is set in motion, so 
we need something like a pendulum consisting of a solid rod, which we discuss 
in Chapter 6, with particular attention in Problem 6-3. 







(a) We have y = Lsin 6/2, and L = 2/* — 21 cos@ ~ 2/* — 2/1 for small 6. 
Conclude that for small oscillations, y satisfies the same equation as $\theta$. 

(b) We also have $x = l - \sqrt{l^2 - y^2}$, and it follows that x also satisfies the same 
equation. 

Another approach is indicated in Problem 3-16. 


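23. Returning to the exact equation 

    $$\theta'' + \frac{g}{l}\sin\theta = 0,$$

multiply by $\theta'$ to conclude that 

    $$\tfrac{1}{2}\theta'^2 = \frac{g}{l}\cos\theta + C$$

for a constant C, and then that (with $\omega = \sqrt{g/l}$ as in Problem 21, and $\alpha$ the 
maximum angle) 

    $$\theta'^2 = 2\omega^2(\cos\theta - \cos\alpha),$$

so that the period T is given by 

    $$T = \frac{2}{\sqrt{2}\,\omega}\int_{-\alpha}^{\alpha}\frac{d\theta}{\sqrt{\cos\theta - \cos\alpha}}.$$

With a little work, this can be expressed in terms of standard elliptic integrals: 

(a) Use the identity 

    $$\cos\theta - \cos\alpha = 2\left(\sin^2\tfrac{1}{2}\alpha - \sin^2\tfrac{1}{2}\theta\right)$$

to get 

    $$T = \frac{1}{\omega}\int_{-\alpha}^{\alpha}\frac{d\theta}{\sqrt{\sin^2\tfrac{1}{2}\alpha - \sin^2\tfrac{1}{2}\theta}}.$$

Now we are going to use the substitution 

    $$\sin\tfrac{1}{2}\theta = \sin\tfrac{1}{2}\alpha\cdot\sin x.$$

Noting that 

    $$\left(\sin^2\tfrac{1}{2}\alpha - \sin^2\tfrac{1}{2}\theta\right)^{1/2} = \sin\tfrac{1}{2}\alpha\,\cos x,$$

conclude that 

    $$T = \frac{4}{\omega}\int_0^{\pi/2}\frac{dx}{\sqrt{1 - \sin^2\tfrac{1}{2}\alpha\,\sin^2 x}}.$$

Setting 

    $$k = \sin\tfrac{1}{2}\alpha,$$

we have finally 

    $$T = 4\sqrt{l/g}\int_0^{\pi/2}\frac{dx}{\sqrt{1 - k^2\sin^2 x}}.$$

Note that this approaches $2\pi\sqrt{l/g}$ as $k \to 0$, i.e., for small $\alpha$; on the other hand, 
it approaches $\infty$ as $k \to 1$, i.e., for $\alpha \to \pi$, where the pendulum is swinging 
almost all the way to an upright position above the pivot point. 
Expanding the term $(1 - k^2\sin^2 x)^{-1/2}$ by the binomial theorem, we have 

    $$T = 2\pi\sqrt{l/g}\,\left(1 + \tfrac{1}{4}\sin^2\tfrac{1}{2}\alpha + \tfrac{9}{64}\sin^4\tfrac{1}{2}\alpha + \cdots\right).$$

For $\alpha$ < 14°, the first correction term is approximately 1 part in 20,000. 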
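
The elliptic integral and the series for the period can be compared numerically. The Python sketch below evaluates T = 4 sqrt(l/g) ∫ dx / sqrt(1 - k^2 sin^2 x) over [0, π/2] with a plain midpoint rule and compares it with the first three terms of the series. The function names, the number of subintervals and the sample values of l, g and α are hypothetical; the midpoint rule is just one simple choice of quadrature.

```python
import math

def exact_period(length, g, alpha, n=2000):
    """Period from the elliptic integral, using the midpoint rule with n cells."""
    k = math.sin(alpha / 2.0)
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += 1.0 / math.sqrt(1.0 - (k * math.sin(x)) ** 2)
    return 4.0 * math.sqrt(length / g) * total * h

def series_period(length, g, alpha):
    """First three terms of the series: 1 + (1/4) s + (9/64) s^2, s = sin^2(alpha/2)."""
    s = math.sin(alpha / 2.0) ** 2
    return 2.0 * math.pi * math.sqrt(length / g) * (1.0 + s / 4.0 + 9.0 * s * s / 64.0)

length, g, alpha = 1.0, 9.8, 0.3          # hypothetical values; alpha in radians
small_angle = 2.0 * math.pi * math.sqrt(length / g)

print(small_angle, series_period(length, g, alpha), exact_period(length, g, alpha))
```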





24. Although we won’t be considering electricity and magnetism in any detail 
in this volume, there’s one simple application of interest in regard to the basic 
laws, and we will refer to it later in Addendum 5A. 

A magnet produces a “magnetic field” B, which can be thought of as the force 
that the magnet produces on a small “test mass” of iron (like the iron filings used 
to show the “lines of force” around a magnet, which are the integral curves of 
the vector field B). 

The magnet also produces a force on a moving charged particle, of mass m 
and charge q, which depends upon the velocity v according to the so-called 
Lorentz Force Law, $F = q(v \times B)$ with proper choices of units for force and 
charge, so that the particle satisfies 

    $$m v' = q(v \times B).$$


(If you sense something terribly wrong with this law, good for you! See Adden- 
dum 5A.) 

Magnetic fields can also be created by currents, and we can get a constant 
magnetic field with a “solenoid”, a long coil of wire with a current running 
through it, which produces a magnetic field B that is nearly constant inside the 
coil, except near the ends. (Of course, the magnetic field of a magnet is also 
nearly constant at large distances from the magnet.) 




(a) If B points along the third axis, with constant length B, then the components 
$v_i$ of v satisfy 

    $$v_1' = \omega v_2, \qquad v_2' = -\omega v_1, \qquad v_3' = 0, \qquad \text{where } \omega = qB/m.$$

Using the obvious solution (compare the footnote on page 339) 

    $$v_1 = a\cos(\omega t), \qquad v_2 = -a\sin(\omega t),$$

show that the path of the particle is a helix with constant velocity. Notice that $\omega$, 
and therefore the number of revolutions per second, is independent of both the 
initial velocity |v(0)| and the radius a of the orbit. It also depends on the ratio 
q/m of charge to mass, the basis for J. J. Thomson’s experiments of 1897 to 
measure this ratio for the electron. 

(b) The pitch h of the helix, the vertical distance gained through one revolution, 
is given by 

    $$h = \frac{2\pi v_3(0)}{\omega} = \frac{2\pi m v_3(0)}{qB}.$$
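
The helical motion and the pitch formula h = 2π m v_3(0)/(qB) can be checked by integrating mv' = q(v × B) directly. The Python sketch below takes B along the third axis, integrates position and velocity over one revolution of duration 2πm/(qB) with a classical Runge-Kutta step, and returns the final state. The function name `lorentz_trajectory` and all numerical values are hypothetical, with units left unspecified.

```python
import math

def lorentz_trajectory(q, m, B, v0, t_end, steps=20_000):
    """Integrate m v' = q (v x B) with B = (0, 0, B), starting at the origin.

    Returns the final position and velocity as two 3-tuples.
    """
    w = q * B / m

    def deriv(state):
        x, y, z, vx, vy, vz = state
        # v x B = (vy * B, -vx * B, 0), so v1' = w v2, v2' = -w v1, v3' = 0.
        return (vx, vy, vz, w * vy, -w * vx, 0.0)

    state = (0.0, 0.0, 0.0) + tuple(v0)
    dt = t_end / steps
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state[:3], state[3:]

q, m, B = 2.0, 0.5, 1.5                       # hypothetical charge, mass, field
revolution = 2.0 * math.pi * m / (q * B)      # time for one turn, 2 pi / omega
pos, vel = lorentz_trajectory(q, m, B, (3.0, 0.0, 0.4), revolution)

print(pos, vel)
```

After one revolution the particle should be back above its starting point, displaced along the third axis by the pitch, with its velocity restored.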


25. In our discussion of Newton’s proof of the parallelogram law for forces, we 
didn’t mention that at the very least one has to assume that the law holds for 
collinear forces! In other words, we must appeal to experiment to conclude that 
“forces in the same direction are additive”, remembering that our experimental 
cart is used to define the measurement of force, as on page 16: If our cart needs 
to be given the acceleration $a_1$ in order to compress the spring to length $L_0$ 
when some object is placed in the carriage of the air-trough, and we replace 
the single spring with two springs of the same construction, we find that the 
cart must be given the acceleration $\alpha_1 = 2a_1$ in order for the two springs to be 
compressed simultaneously to the length $L_0$. Equivalently, the mass m of our 
body can be determined by 

    $$m = \frac{a_0}{\alpha_1/2}$$

using two springs instead of one ($a_0$ being the convenient acceleration that we 
used on our “unit mass” to determine $L_0$ with just one spring). 
By considering two copies of our object on the carriage, with a spring behind 
each, conclude that the mass of this new object is 2m (the general rule for 
additivity of mass following in a fairly obvious way). 


26. Accurate weighing basically requires a good “balance”, which at heart is 
nothing more than a lever with equal arms; without knowing the law of the 





lever, we can still say that two objects have equal weight if they balance. Just to 
be on the safe side, of course, we also switch their position to check that they 
still balance. 

Once we’ve chosen some object for our “unit” weight, we can, for example, 
produce two new bodies that balance against each other, and that together 





balance against the unit weight, each of which could then be called a half-unit 
weight. Similarly, we could create any number of 1/10-unit weights, 1/100-unit 
weights, etc., and then weigh any object to any desired degree of accuracy by 
balancing it against a suitable collection of the units. 

This procedure might make it appear that “additivity of weight”, and thus 
“additivity of mass”, has simply been declared by fiat. What implicit experi- 
mental results are involved? 


CHAPTER 2 


NEWTON’S ANALYSIS 
OF CENTRAL FORCES 


Book 1 of the Principia is called The Motion of Bodies. The first section
begins with geometric considerations that we would nowadays rephrase in
terms of limits, but the second immediately begins with “Kepler’s second law”,
though Newton does not mention Kepler’s name in this regard.¹
Kepler’s second law, which was actually the first he discovered, says that the
radius vector of a planet sweeps out equal areas in equal time, or equivalently,
that the area swept out in time t is proportional to t.





Newton pointed out that this is a consequence of the fact that the gravitational
force that the sun produces on the planet is always directed along the line from
the planet to the sun, or equivalently, that the acceleration of the planet is always
directed toward the sun, the specific magnitude of this force being irrelevant.
Or as Newton expressed it:


Proposition 1. The areas which bodies made to move in orbits describe by
radii drawn to an unmoving center of forces lie in unmoving planes and are
proportional to the times.


This turns out to be extremely easy to prove analytically, especially if we 
use the cross-product of vectors. For simplicity assume that our force is always 
directed toward the origin O, and let c be any particle. We always have 


\[ (c \times v)' = (c \times v') + (c' \times v) = (c \times v') + (v \times v) = c \times v', \]

so if v’ points along c, we just get (c × v)’ = 0, and consequently the relation

\[ c \times v = w, \qquad w \text{ a constant vector.} \tag{*} \]


¹ Some discussion of this matter may be found in Cohen and Whitman [1; pg. 21], which
throws light on numerous other questions of this sort.




If w = 0, then v(t) always points along the line from O to c(t), and our
particle must simply be moving along a straight line towards O. If w ≠ 0,
then, since the inner product satisfies

\[ 0 = \langle c(t) \times v(t),\, c(t) \rangle = \langle w,\, c(t) \rangle, \]


we see immediately that c(t) always lies in one plane. 
Moreover, c(t) x v(t) has a natural interpretation in terms of the area swept 
out by the radius vector. For small h, this area, S(t + h) − S(t), is approximately
the area of the shaded triangle, and thus approximately

\[ \tfrac12 \bigl| c(t) \times [c(t+h) - c(t)] \bigr|. \]

Consequently, in the limit we have

\[ S'(t) = \tfrac12 \lim_{h\to 0} \left| c(t) \times \frac{c(t+h) - c(t)}{h} \right| = \tfrac12 |c(t) \times v(t)|. \]


Thus, (*) implies that S’(t) is constant, or that S(t) is proportional to t.
This approximation argument isn’t really necessary, for we can simply
write c as

\[ c(t) = r(t)\cdot(\cos\theta(t), \sin\theta(t)), \]

compute v = c’ and observe that

\[ |c \times v| = r^2\theta', \]

and $\tfrac12 r^2\theta'$ is just the integrand required to compute areas in polar coordinates.
For future reference, we record the result

\[ r^2\theta' = |w| = h, \text{ say.} \tag{*$'$} \]
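The following sketch checks the area law numerically, under assumptions of our own choosing: the acceleration is taken to be $-c/|c|^{2.5}$ (any central law would do, since the magnitude is irrelevant), the initial data are arbitrary, and the helper names are hypothetical. It samples $c \times v$ along the orbit and also sums the triangle areas $\tfrac12|c(t)\times[c(t+h)-c(t)]|$ used in the approximation argument.

```python
import math

def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for y' = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def central_rhs(y, p=2.5):
    """Planar motion with acceleration -c/|c|**p, always directed toward the origin."""
    x, yy, vx, vy = y
    r = math.hypot(x, yy)
    return [vx, vy, -x / r ** p, -yy / r ** p]

def sweep(y0, dt=1e-3, steps=5000):
    """Return samples of the scalar c x v and the polygonal estimate of the swept area."""
    y = list(y0)
    w_samples = [y[0] * y[3] - y[1] * y[2]]
    area = 0.0
    for _ in range(steps):
        y_next = rk4_step(central_rhs, y, dt)
        # triangle with vertices O, c(t), c(t + dt)
        area += 0.5 * abs(y[0] * (y_next[1] - y[1]) - y[1] * (y_next[0] - y[0]))
        y = y_next
        w_samples.append(y[0] * y[3] - y[1] * y[2])
    return w_samples, area

w_samples, area = sweep([1.0, 0.0, 0.3, 0.9])
h = abs(w_samples[0])
elapsed = 5000 * 1e-3
```

If the argument above is right, every entry of `w_samples` should agree with the first, and `area` should be close to `0.5 * h * elapsed`.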


It might seem hard to improve upon this analytic proof, but Newton provides 
a geometric proof for Proposition 1, approximating the curve by a polygon, 
which is not only simple, but also seems to show just why the proposition is 
true. 




As a sentimental gesture, we give a replica of the diagram that Newton uses 
in his proof 





Newton assumes that the particle follows the path ABCD ..., receiving “impulsive”
forces at short equal intervals of time, and that these impulsive forces
at B, C, ... are always directed toward S, so that the path sweeps out the
triangular areas △SAB, △SBC, .... Newton merely has to point out that, if not
for the impulsive force at B, the particle would move to c, with Bc = AB. In
this case, it would sweep out the triangle △SBc, which has the same area as
△SAB (since they have equal bases, and the same height). The impulsive force
applied at B will instead send the particle to C, which will be at the diagonal
of the parallelogram formed by Bc and a line BV pointing along SB, since we
are assuming that the force is directed toward S. This means that Cc is parallel
to BV, and this in turn means that △SBc has the same area as △SBC (since
these triangles have the common base SB and the same height above that base).
In short, the area of △SAB is the same as the area of △SBC, and so on, all
along the path!¹
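Newton's polygonal construction is easy to imitate in a few lines. In the sketch below (our own illustration, with made-up starting points and made-up impulse sizes), each new vertex C is obtained from the inertial point c = B + (B − A) by adding a displacement along the line from B toward S; the function names are hypothetical.

```python
def cross(u, v):
    """Scalar cross product of two plane vectors."""
    return u[0] * v[1] - u[1] * v[0]

def triangle_area(S, A, B):
    """Area of the triangle SAB."""
    return 0.5 * abs(cross((A[0] - S[0], A[1] - S[1]), (B[0] - S[0], B[1] - S[1])))

def polygon_path(S, A, B, kicks):
    """Extend the path A, B by one vertex per kick.

    Each kick k displaces the particle by k * (S - B), i.e. along BS,
    on top of the inertial step Bc = AB.
    """
    pts = [A, B]
    for k in kicks:
        A, B = pts[-2], pts[-1]
        c = (2 * B[0] - A[0], 2 * B[1] - A[1])              # where inertia alone would lead
        C = (c[0] + k * (S[0] - B[0]), c[1] + k * (S[1] - B[1]))
        pts.append(C)
    return pts

def swept_areas(S, pts):
    """Areas of the successive triangles S pts[i] pts[i+1]."""
    return [triangle_area(S, pts[i], pts[i + 1]) for i in range(len(pts) - 1)]

S = (0.0, 0.0)
pts = polygon_path(S, (1.0, 0.0), (1.0, 0.2), [0.1, 0.5, 0.02, 0.3])
areas = swept_areas(S, pts)
```

The entries of `areas` should all coincide, whatever the sizes of the impulses, which is exactly the point of the argument.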


¹ Various details about the rigor of Newton’s arguments may be found in Pourciau [4].




It is also noteworthy that Newton expressly states a converse of Proposition 1, 
the proof being pretty much the same, and one might not think it would be of 
much interest. In fact, Newton never mentions it again (even though, as we 
shall see later, it plays a crucial role): 


Proposition 2. Every body that moves in some curved line described in a 
plane and, by a radius drawn to a point, ... describes areas around that 
point proportional to the times, is urged by a centripetal force tending toward
that same point. 


Newton’s proof of Kepler’s second law is often presented in elementary physics
books because of its dual virtues of simplicity and transparency. Unfortunately,
this presentation usually marks the end of the exposition, with the lament that
Newton’s further investigations require many abstruse properties of conic
sections which are unfamiliar to us nowadays. This turns out to be a double
disappointment. While it is quite understandable that a geometric proof would
use many geometric properties of conics, the real mystery is how an hypothesis
about inverse square forces is going to be related to geometric properties of
conic sections. Moreover, although we won’t pursue Newton’s argument in its
entirety, it turns out that Newton’s strategy for the proof is extremely clever.¹

To see how Newton relates the forces to the geometry, we need only follow,
with slight modifications, a few steps that he adds a bit later on. In the
diagram for Proposition 1, breaking up the motion into small intervals $\delta$ of time,
consider the segment BB’ from B to the midpoint B’ of the diagonal AC. This
is half of BV, which represents the displacement due to the central force of
magnitude F at B, and this distance is just $\tfrac12 F\delta^2$. So, in the limit,

\[ \tfrac12 F = \lim_{\delta\to 0} \frac{BB'}{\delta^2}. \]


Or as Newton phrases it, in inexact poetical-sounding terms, so much more 
beautiful than limits and epsilons and deltas, 


¹ So clever, in fact, that it has caused arguments up to recent times; see the remarks in
the last section of Chapter 7.




If... a body revolves in any orbit about an immobile center and de- 
scribes any just-nascent arc in a minimally small time, and if the sagitta 
of the arc is understood to be drawn so as to bisect the chord and, when 
produced, to pass through the center of forces, the centripetal force in 
the middle of the arc will be as the sagitta directly and as the time 


twice inversely. 
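[Figure: the arc AC about the center of forces S, with the sagitta XY through the midpoint X of the chord AC.]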


It shouldn’t be necessary to explicate the old-fashioned term sagitta [Latin for
arrow], because in this instance Newton’s statement explicitly indicates that he
is referring to the segment XY of the line through S and the midpoint X of the
chord AC. The fraction ½ doesn’t appear in Newton’s statement because the
result is phrased as a proportion: the ratio of the centripetal forces at points A
and A’ is the same as the ratio of the limits $\lim XY/\delta^2$ for arcs starting at A
and A’, respectively.

Newton next considers a point P at time t on a curved path around the
center S, and two points Q, Q’ on the path, at two nearby times $t - \delta$ and $t + \delta$.

Then the force at P is proportional to

\[ \lim_{\delta\to 0} \frac{PX}{\delta^2}. \]

But, by Proposition 1, $\delta$ is proportional to the area of the (curved) triangle SPQ,
and thus, in the limit, to QT × SP, where QT is perpendicular to SP. Thus,
finally, the force at P is proportional to

\[ \lim_{\delta\to 0} \frac{PX}{(SP)^2 \times (QT)^2}. \]




Newton, however, actually presents a figure that has a tangent line drawn 
at P, and the line QR drawn parallel to SP, with the assertion that the force
at P is proportional to

\[ \lim_{\delta\to 0} \frac{QR}{(SP)^2 \times (QT)^2}. \tag{*} \]

Of course, QR is not actually equal to PX, but it is apparently obvious to
Newton that it is equal to second order so that the limit still holds.¹


And now Newton is all prepared to show that the orbit of an object moving 
under an inverse square force is a conic section. Newton begins in a way that 
might seem strange to us, by proving a partial converse of this assertion: 


Let a body revolve in an ellipse; it is required to find the law of the centripetal
force tending toward a focus of the ellipse.


In other words, given a path c lying along an ellipse, if c” always points towards
one focus of the ellipse, Newton is going to show that

\[ |c''(t)| = \frac{k}{d(t)^2} \quad \text{for some constant } k, \]


where d(t) is the distance from c(t) to the focus. 


The relation (*) is the key to all this. In fact, in view of (*), our assertion is
equivalent to saying that for an ellipse we have

\[ \lim_{\delta\to 0} \frac{QR}{(QT)^2} \quad \text{is a constant.} \]


This limit has nothing to do with forces, and is completely determined by the 
shape of the ellipse. It could even be computed by a double application of 


¹ For a more exact statement, and its somewhat intricate proof, see Pourciau [5].




L’Hôpital’s Rule: If F(δ) denotes QR and G(δ) denotes (QT)², then we have
$\lim_{\delta\to 0} F'(\delta) = \lim_{\delta\to 0} G'(\delta) = 0$, and

\[ \lim_{\delta\to 0} \frac{QR}{(QT)^2} = \frac{F''(0)}{G''(0)}; \]

when the unpleasant calculation (Problem 7) is carried through, it turns out that
F”(0)/G”(0) is independent of the point P, and in fact = a/2b² for the ellipse
shown below; the reciprocal, 2b²/a, is the length of the classical latus rectum of
the ellipse, the segment cut off by the ellipse on the vertical line through one of
the foci.
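Before doing the unpleasant calculation, one can at least test the claim numerically. The sketch below is our own, under stated assumptions: the ellipse is parameterized as (a cos s, b sin s), the focus S is taken at (c, 0), Q is the point at parameter s + eps, R is where the line through Q parallel to SP meets the tangent at P, and QT is the distance from Q to the line SP. The function name and the sample values are hypothetical.

```python
import math

def qr_over_qt2(a, b, s0, eps=1e-5):
    """Approximate QR/(QT)^2 at the point P = (a cos s0, b sin s0) of the ellipse."""
    c = math.sqrt(a * a - b * b)
    S = (c, 0.0)                                          # focus
    P = (a * math.cos(s0), b * math.sin(s0))
    Q = (a * math.cos(s0 + eps), b * math.sin(s0 + eps))  # nearby point on the ellipse
    T = (-a * math.sin(s0), b * math.cos(s0))             # tangent direction at P
    d = math.hypot(P[0] - S[0], P[1] - S[1])
    u = ((P[0] - S[0]) / d, (P[1] - S[1]) / d)            # unit vector along SP

    def cross(p, q):
        return p[0] * q[1] - p[1] * q[0]

    # R = Q + lam * u lies on the tangent line P + mu * T; crossing with T removes mu.
    lam = -cross((Q[0] - P[0], Q[1] - P[1]), T) / cross(u, T)
    QR = abs(lam)
    QT = abs(cross((Q[0] - S[0], Q[1] - S[1]), u))        # distance from Q to the line SP
    return QR / QT ** 2

a, b = 2.0, 1.0
ratios = [qr_over_qt2(a, b, s0) for s0 in (0.3, 1.5, 2.7, 4.0)]
expected = a / (2 * b * b)
```

For each sample point the ratio should come out close to a/2b², independently of P.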





Newton proves exactly this result geometrically, and the proof is indeed long, 
complicated, and depends on numerous results about the ellipse. For a complete 


exposition of this proof see Newton [pp. 325-330]. 


Newton then gives a similar proof for a body moving on a hyperbola, and
finally a proof for a body moving on a parabola. 


And immediately afterwards, the result we really wanted appears as a corol- 
lary: 


Corollary 1. From the last three propositions it follows that if any
body P departs from the place P along any straight line PR with any 
velocity whatever and is at the same time acted upon by a centripetal 
force that is inversely proportional to the square of the distance of 
places from the center, this body will move in some one of the conics 
having a focus in the center of forces; and conversely. 


In the first edition of the Principia this is all that appears (the result is claimed
to be a corollary of its converse(s)), but in the second edition Newton added a
few sentences to aid the reader (with a bit of rewording in the third edition):




Corollary 1. From the last three propositions it follows that if any
body P departs from the place P along any straight line PR with any 
velocity whatever and is at the same time acted upon by a centripetal 
force that is inversely proportional to the square of the distance of 
places from the center, this body will move in some one of the conics 
having a focus in the center of forces; and conversely. For if the focus 
and the point of contact and the position of the tangent are given, a 
conic can be described that will have a given curvature at that point. 
But the curvature is given from the given centripetal force and velocity 
of the body; and two different orbits touching each other cannot be 
described with the same centripetal force and the same velocity. 


Here, in slightly different terms, is a complete argument, where for simplicity
we simply work in R², and we choose the origin O as the point toward which
the force is directed. Given a point P, and a tangent vector v at P, we want to
find a curve $c = (c_1, c_2)$ with c(0) = P and c’(0) = v satisfying

\[ c''(t) = \frac{k}{|c(t)|^2} \cdot \frac{-c(t)}{|c(t)|}, \tag{*} \]

where k is a given constant (the factor −c(t)/|c(t)| is just a unit vector pointing
from c(t) to the origin).

Since we know c’(0), and (*) gives us c”(0), we know what the curvature κ
of c at 0 should be, since this is given by

\[ \kappa = \frac{c_1'(0)\,c_2''(0) - c_2'(0)\,c_1''(0)}{\bigl(c_1'(0)^2 + c_2'(0)^2\bigr)^{3/2}}, \tag{**} \]

a formula known, in essence, to Newton, though curvature was defined in terms
of the osculating circle.

Now consider a conic section K having O as a focus, which passes through P,
and is tangent to v at P, and whose curvature at P is this κ, assuming for the
moment that such a conic section exists. Consider a curve γ with γ(0) = P,
which traverses K in such a way that the area cut out by radii from O is
proportional to the time. Such curves are determined up to a multiplicative


change of parameter; by choosing the appropriate multiplicative constant, we
can arrange for γ’(0) = v. According to our converse Proposition 2, which
Newton carefully provided, we have

\[ \gamma''(t) = \frac{\bar k}{|\gamma(t)|^2} \cdot \frac{-\gamma(t)}{|\gamma(t)|} \]

for some $\bar k$. But we must have $\bar k = k$, since we chose γ so that its curvature at
γ(0) = P would be the κ given by (**).

Thus, γ is a solution of our differential equation (*), and by uniqueness (which
of course Newton and all his contemporaries implicitly assumed) it is the only
possible solution.


Newton’s argument might strike us as a little weird, starting as it does with
the converses of the result we want, but a geometric proof almost has to be of this
nature: it’s a lot easier to start with a geometric object (an ellipse, or hyperbola,
or parabola) and deduce a formula for forces, than it would be to start with
the formula for forces and somehow conjure up these geometric figures.

The only slight lacuna in the argument is the existence of a conic section with 
the required curvature, but Newton gives a detailed geometric solution to this 
problem a little later on in Proposition 17 of Book 1 (a somewhat unfortunate 
shuffling of the proper expository order). Analogously, there is the analytic
problem of describing the solutions of (*) in terms of the initial values. We will 
defer this problem, and other aspects of planetary motion, to Chapter 4, where 
we give a connected presentation of all the relevant material. 

However, just to get some idea how the orbits are determined by the initial 
conditions, we will examine qualitatively the special case where our initial ve- 
locity v is perpendicular to the radius vector to the point. In other words, we 
consider the case where we start at a vertex of the conic section (more specif- 
ically, in the case of an ellipse, we are starting at the vertex at the end of the 
major axis). 

We assume that our force is exerted toward the origin O, with magnitude

\[ \frac{1}{|c(t)|^2} \]

(taking the constant k as 1 for simplicity), and that we are looking for a solution c
through a fixed point c(0) = (R, 0), with an initial velocity (0, v) of magnitude v.
In this case, we easily compute that the curvature κ of c at 0 given by (**) on
page 62 is

\[ \kappa = \frac{1}{v^2 R^2}, \]

so we have to find appropriate conics with varying κ, where small κ corresponds
to large v and vice versa.
Using the parameterization

\[ c(t) = (a\cos t, b\sin t) \]

of an ellipse with axes a > b, we easily compute that the curvature κ at the end
of the major axis is

\[ \kappa = \frac{a}{b^2}. \]

There are two classes of ellipses with a vertex at (R,0) and a focus at O,
depending on whether (R,0) and O are on opposite sides of the center of the
ellipse, or are on the same side. In both cases we have

\[ (a - R)^2 = c^2 = a^2 - b^2, \]

which gives

\[ b^2 = 2aR - R^2, \]

so that the curvature at the end of the major axis is

\[ \kappa = \frac{a}{2aR - R^2}, \]

whose graph is shown below.

[Figure: graph of κ = a/(2aR − R²) for a > R/2.]







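For a = R we have κ = 1/R, which is a circle, corresponding to

\[ \frac{1}{R} = \frac{1}{v^2 R^2}, \]

or $v = \sqrt{1/R}$. As we make v smaller we obtain a family of ellipses from the
first class, which converge to a straight line from our point (R, 0) to the origin.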
first class, which converge to a straight line from our point (R, 0) to the origin. 


As we make v larger, we obtain a family of ellipses from the second class, 
whose curvatures at the vertex approach 1/2R, and these ellipses approach 





a parabola with curvature 1/2R at its vertex. Still larger values of v give a 




family of hyperbolas approaching the vertical line through (R, 0), which is the 


orbit when v is “infinitely large”, or equivalently, when the gravitational force 
is negligible. 
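The qualitative picture above can be condensed into a small classifier. This is only a sketch under the same assumptions as the discussion (k = 1, start at (R, 0) with velocity (0, v)); equating 1/(v²R²) with a/(2aR − R²) and solving for a gives a = R/(2 − v²R), a step we have supplied ourselves, and the function names are hypothetical.

```python
import math

def vertex_curvature(R, v):
    """Curvature forced on the orbit at (R, 0) by the inverse square law with k = 1."""
    return 1.0 / (v * v * R * R)

def semi_major_axis(R, v):
    """Solve a/(2aR - R^2) = 1/(v^2 R^2) for a; only meaningful when v^2 R < 2."""
    s = v * v * R
    if s >= 2.0:
        raise ValueError("not an ellipse: v*v*R must be below 2")
    return R / (2.0 - s)

def classify(R, v):
    """Type of conic for the given start radius R and perpendicular speed v."""
    s = v * v * R                      # circle at s = 1, parabola at s = 2
    if math.isclose(s, 1.0, rel_tol=1e-12):
        return "circle"
    if math.isclose(s, 2.0, rel_tol=1e-12):
        return "parabola"
    return "ellipse" if s < 2.0 else "hyperbola"

# Consistency check at an arbitrary elliptical case.
R, v = 1.0, 1.2
a = semi_major_axis(R, v)
kappa_from_ellipse = a / (2 * a * R - R * R)
kappa_from_force = vertex_curvature(R, v)
```

For example, `classify(1.0, 1.0)` gives the circle, speeds below or moderately above the circular speed give ellipses, and the two curvature values computed at the end should agree.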




PROBLEMS 


1. (a) Right after the proof of Proposition 2, stated on page 58, Newton inserted
a very simple result, concerning circular orbits (which the elliptical orbits of the
planets very closely approach). The orbit can be described by

\[ c(t) = r(\cos\omega t, \sin\omega t), \]


as on page 18, where we computed c”(t). Correlate this with Newton’s state- 
ment of the result (Newton [2; pg. 449]): 


The centripetal forces of bodies that describe different circles with uniform
motion tend toward the centers of those circles and are to one another as the
squares of the arcs described in the same time divided by the radii of the circles.


(b) Conclude that for two such orbits, the values of |c”(t)| (and thus of the
forces causing the motion) are inversely as the squares of the radii if and only if
the periods are as the 3/2 powers of the radii.


This is Kepler’s third law, and Newton interjects a brief Scholium stating that 
this 


“holds for the heavenly bodies (as our compatriots Wren, Hooke, and 
Halley have also found out independently). Accordingly, I have de- 
cided that in what follows I shall deal more fully with questions relating 
to the centripetal forces that decrease as the squares of the distances 
from centers... ” 


Later on, after proving that inverse square forces give elliptical orbits for the 
planets, Newton proves the general case of Kepler’s third law, that the peri- 
ods are as the 3/2 powers of the major axes. We will consider this result in 
Chapter 4. 
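A quick numerical illustration of part (b), offered as a sketch rather than a proof: for a uniform circular motion of radius r and angular speed ω the acceleration has magnitude rω², and if we insist that this equal k/r² (with k an arbitrary constant of our choosing), the resulting periods scale as the 3/2 power of the radius. The function names are hypothetical.

```python
import math

def centripetal_acceleration(r, omega):
    """|c''(t)| for c(t) = r (cos(omega t), sin(omega t))."""
    return r * omega ** 2

def period(omega):
    """Time for one revolution."""
    return 2 * math.pi / omega

def omega_for_inverse_square(r, k=1.0):
    """Angular speed for which r * omega^2 equals k / r^2."""
    return math.sqrt(k / r ** 3)

radii = [0.5, 1.0, 2.0, 8.0]
ratios = [period(omega_for_inverse_square(r)) / r ** 1.5 for r in radii]
```

All entries of `ratios` should be equal (to 2π when k = 1), which is the circular case of Kepler's third law.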


2. It appears¹ that during the Plague Years of 1665-1666, Newton tested the
inverse square law for the case of the moon revolving around the earth.

It was known, from triangulation, that the distance from the center of the earth
to [the center of] the moon was close to 60 times the radius of the earth. For the
length of a degree of latitude at the surface of the earth (i.e., 1/360 of the earth’s
circumference) Newton took the common estimate at that time of 60 miles, the
mile at that time apparently taken as 5,000 feet. Thus, the circumference of the
earth was to be reckoned as 1,080 × 10⁵ feet, the radius as 171,887 × 10² feet, and
the distance from the center of the earth to the moon as 1,031,322 × 10³ feet.
The period of the moon is very close to 27 days and 8 hours, or 39,360 minutes.

¹ For details, suppositions, caveats, etc., see Chapter 4 of Herivel [1], as well as the
discussion in Cohen and Whitman [1; pp. 67-70].


(a) Assuming that the moon travels with uniform velocity in a circle around 
the earth, compute its velocity, and then use Problem 1-5 (a result known to 
Newton early on, presumably by something like the geometric proof in that 
problem) to compute that its acceleration towards the center of the earth is 
about 26.27 feet/min², or .0072972... feet/sec².

(b) Using the fact that the acceleration of bodies near the earth’s surface is
32 feet/sec² (this was known from pendulum observations), compute that the
acceleration of the moon towards the center of the earth should be about
.00888... feet/sec².
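The arithmetic of this problem is easy to reproduce. The sketch below uses only the figures quoted in the problem (60 miles per degree, 5,000 feet per mile, 60 earth radii, 27 days 8 hours) together with the standard formula v²/r for uniform circular motion; the variable and function names are ours.

```python
import math

FEET_PER_MILE = 5000          # the mile as apparently reckoned at the time
MILES_PER_DEGREE = 60         # the common estimate Newton used

circumference_ft = 360 * MILES_PER_DEGREE * FEET_PER_MILE
earth_radius_ft = circumference_ft / (2 * math.pi)
moon_distance_ft = 60 * earth_radius_ft
period_min = 27 * 24 * 60 + 8 * 60          # 27 days and 8 hours, in minutes

def moon_acceleration_ft_per_min2():
    """Centripetal acceleration v^2 / r for a uniform circular orbit."""
    speed = 2 * math.pi * moon_distance_ft / period_min
    return speed ** 2 / moon_distance_ft

def predicted_acceleration_ft_per_sec2(surface_g=32.0, radii=60):
    """Inverse square prediction: surface acceleration divided by 60^2."""
    return surface_g / radii ** 2

observed = moon_acceleration_ft_per_min2() / 3600.0   # converted to ft/sec^2
predicted = predicted_acceleration_ft_per_sec2()
```

With these inputs `observed` comes out near .0073 feet/sec² and `predicted` near .0089 feet/sec², which is the discrepancy that the better value for the degree of latitude later reduced.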


In 1682, the French astronomer Picard announced a more accurate measure- 
ment of 69.1 miles for the length of a degree of latitude, which gave Newton a 
much better correlation, carried out in detail in Book 3 of the Principia, Newton 
[2; pp. 803-804]. Although the importance of this correlation was mentioned 
in Chapter 1, Newton actually tackled much more detailed problems in Book 3, 
a large part of which was devoted to showing how the intricate details of the 
moon’s motion could be derived from the principles of gravitational physics. For 
an extremely instructive discussion of these matters, see Cohen and Whitman 


[1; pp. 246-264] (as well as other sections of Chapter 8 of that book’s Guide). 


After proving the relation (*) on page 60, Newton does not immediately tackle
the problem of planetary orbits, but instead devotes the remainder of the section
to other problems that indicate how (*) may be used. His first example is one
that his methods handle quite nicely:






3. We want to find what central force at a point S will cause a body to move
on the circumference of a circle when S is not the center of the circle. Newton
presents a rather complicated figure that is most easily understood by
considering it in steps, shown in the diagram on the next page:

(a) We consider a point P on the circle, and a nearby point Q, and draw the
line from P to S, intersecting the circle at V.

(b) We then draw the diameter VA.

(c) We next draw the line TQ perpendicular to SP, and extend it until it meets
the tangent line to the circle at P in the point Z.

(d) Finally, we draw the line LR parallel to VP intersecting the circle at L 
and the tangent line at R. 




In (c) we also draw the radius to P, which is perpendicular to ZP, and
conclude that ∠TZP = α, so that the right triangle ZTP is similar to the right
triangle VPA.

We also want to recall the elementary geometry theorem that for two secant
lines through a point, we have the relation BC·BD = BE·BF, and the obvious
consequence when one of the secant lines is actually a tangent line (E = F).







(a) We have

\[ \frac{RP}{QT} = \frac{ZP}{ZT} = \frac{AV}{PV}, \]

so

\[ \frac{QR\cdot RL}{(QT)^2} = \frac{(AV)^2}{(PV)^2}, \]

and hence

\[ \frac{(SP)^2\cdot(QT)^2}{QR} = \frac{(SP)^2\cdot(PV)^2}{(AV)^2}\cdot RL. \]

(b) Thus,

\[ \lim_{Q\to P} \frac{(SP)^2\cdot(QT)^2}{QR} = \frac{(SP)^2\cdot(PV)^3}{(AV)^2}, \]

and consequently by equation (*) on page 60, the force is inversely proportional
to (SP)²·(PV)³.


This result appears in the first edition of the Principia, but three remarkable 
corollaries first appear in the second edition: 


(c) (Corollary 1) If a particle moves in a circle under a central force directed to
a point V on the circle, then the force varies inversely as the fifth power of the
distance. (And, since this is true for any point V on the circle, the exact same
force law with two different centers can give the same orbits.)

(d) More generally, suppose that a particle moves on a circle under two different
central forces, one with center at R and one with center at S. Let SG be drawn
parallel to RP, intersecting the tangent line to the circle at P in the point G.
Then the ratio of the first force to the second is

\[ \frac{(RP)^2\cdot(PT)^3}{(SP)^2\cdot(PV)^3} = \frac{SP\cdot(RP)^2}{\dfrac{(SP)^3\cdot(PV)^3}{(PT)^3}}. \]

Conclude that the ratio of the first force to the second is

\[ \frac{(RP)^2\cdot SP}{(SG)^3}. \tag{Corollary 2} \]

(e) By considering the osculating circle to the orbit at P, conclude that the
same results hold for an arbitrary orbit: If a body moves on the same orbit
under central forces directed toward R and S, then the ratio of the forces is
again

\[ \frac{(RP)^2\cdot SP}{(SG)^3}, \tag{Corollary 3} \]

where SG is parallel to RP and meets the tangent line to the orbit at P in the
point G.
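Corollary 1 of part (c) lends itself to a direct numerical experiment. The sketch below is our own illustration under explicit assumptions: unit mass, a circle of diameter d through the origin O, a start at (d, 0) with velocity (0, v), and a force of magnitude k/r⁵ toward O. The constant k = 2v²d⁴ is chosen by us so that the initial curvature (k/d⁵)/v² equals the curvature 2/d of the circle; the function names are hypothetical.

```python
import math

def rk4_step(f, y, dt):
    """One classical Runge-Kutta step for y' = f(y), with y a list of floats."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6.0 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def fifth_power_orbit(d=1.0, v=1.0, dt=1e-4, steps=6000):
    """Integrate c'' = -k c / |c|^6 and track the deviation from the circle.

    The circle has center (d/2, 0) and radius d/2, so it passes through O.
    Returns (largest deviation from the circle, final state).
    """
    k = 2.0 * v * v * d ** 4          # assumed constant, matched to the initial curvature

    def f(y):
        x, yy, vx, vy = y
        r = math.hypot(x, yy)
        return [vx, vy, -k * x / r ** 6, -k * yy / r ** 6]

    y = [d, 0.0, 0.0, v]
    worst = 0.0
    for _ in range(steps):
        y = rk4_step(f, y, dt)
        dist = math.hypot(y[0] - d / 2.0, y[1])
        worst = max(worst, abs(dist - d / 2.0))
    return worst, y

worst, final_state = fifth_power_orbit()
final_r = math.hypot(final_state[0], final_state[1])
```

The integration is stopped well before the particle reaches O, where the force blows up; over that stretch the computed path should stay on the circle to within integration error.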


4. After these corollaries, and a result about an inverse third power law that we
will consider in Chapter 4, Newton considers elliptical orbits under a central
force directed toward the origin, rather than toward a focus. His treatment of
this problem is almost as complicated as his treatment of the inverse square law,
but we easily compute that a particle moving on the orbit

\[ c(t) = (a\cos\alpha t, b\sin\alpha t) \]

has c”(t) = −α²c(t), so that it is a possible motion under a central force varying
directly as the distance to the origin, and obviously the same holds for any ellipse
centered at the origin.


(a) Newton also states the converse as a corollary, again without any further
indication of the proof. For an analytic proof, consider a particle c(t) =
(x(t), y(t)) moving under a central force directed toward the origin, of
magnitude k²r at distance r. This gives the equations x” + k²x = 0 and y” + k²y = 0,
so that x and y are each linear combinations of cos kt and sin kt, say

\[ (x, y) = (\cos kt, \sin kt) \begin{pmatrix} A & C \\ B & D \end{pmatrix}. \]

Solve for cos kt and sin kt in terms of x and y, and use cos² + sin² = 1, to
obtain an equation for a conic centered at the origin, which is necessarily an
ellipse, since x and y are bounded (“elliptic harmonic motion”).

(b) Now comes the really interesting part, which also first appears in the second 
edition. We will need another property of the ellipse, but it is one that we can 
easily derive from familiar ones. Recall that the ellipse with major axis 2a and 




foci $F_1$ and $F_2$ is defined by the property that $F_1P + PF_2 = 2a$ for all points P.

It also has the “focal point” property that a light ray starting from $F_1$ is
reflected by the ellipse so as to pass through $F_2$, i.e., that the two angles
indicated in (a) are equal.

In diagram (b) below we have drawn lines parallel to the tangent line at P
through O and $F_2$. The two angles indicated by thick arcs are equal, so the
two thick segments have the same length β. And the two segments with lengths
indicated as α are equal because $F_1O = OF_2$. Thus

\[ 2\alpha + 2\beta = 2a. \]

Finally (c), moving XP over to OG we see that a line through the origin O
parallel to the line $F_1P$ always intersects the tangent line through P at a point G
with OG = a.

Use this result together with part (a) and Corollary 3 in the previous problem 
to give an alternate proof that if a particle moves in an ellipse under a central 
force directed toward a focus, then the force must vary inversely as the square 
of the distance from the focus. 
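The property OG = a used in part (b) can be checked numerically before it is relied on. The sketch below is our own, with the ellipse parameterized as (a cos s, b sin s), the focus $F_1$ placed at (−c, 0), and hypothetical names throughout.

```python
import math

def og_length(a, b, s):
    """Length OG, where G is the point of the tangent line at P = (a cos s, b sin s)
    lying on the line through the origin O parallel to F1 P."""
    c = math.sqrt(a * a - b * b)
    F1 = (-c, 0.0)
    P = (a * math.cos(s), b * math.sin(s))
    T = (-a * math.sin(s), b * math.cos(s))          # tangent direction at P
    d = math.hypot(P[0] - F1[0], P[1] - F1[1])
    u = ((P[0] - F1[0]) / d, (P[1] - F1[1]) / d)     # unit vector along F1 P

    def cross(p, q):
        return p[0] * q[1] - p[1] * q[0]

    # G = lam * u must equal P + mu * T; crossing with T eliminates mu.
    lam = cross(P, T) / cross(u, T)
    return abs(lam)

a, b = 3.0, 2.0
lengths = [og_length(a, b, s) for s in (0.1, 1.0, math.pi / 2, 2.5, 4.0, 5.5)]
```

Every entry of `lengths` should equal a, whichever point P is chosen.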


Addendum 4E considers in greater detail some of the questions raised by these 
observations of Newton. 


5. Later on in the Principia Newton analyzes orbits much more thoroughly, 
essentially determining not only their shape, but also their parameterization, 
even allowing himself a formula or two now and then; as we will note later on, 




the results correspond to standard modern formulas involving integrals. But 
first he disposes of the question of parabolic orbits: 


If a body moves in a given parabolic trajectory, to find its position at an 
assigned time. 


The graph of f(x) = x² is a parabola with focus at (0, ¼). Compute the
shaded area in the figure on the next page, as a cubic expression in x, and





conclude that we can find the parameterization of a planet moving along this 
parabola, in terms of solutions of cubic equations. 
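Here is a sketch of how such a parameterization might be computed. It rests on an assumption about the figure, namely that the shaded region is the sector bounded by the segment from the focus S = (0, ¼) to the vertex, the arc of the parabola, and the segment from S to P = (x, x²). Under that assumption the area works out (by our own calculation) to x³/6 + x/8, and since the area is proportional to the time, finding x at a given time means solving a cubic. The function names are hypothetical.

```python
def sector_area(x):
    """Assumed shaded area: swept by the segment from (0, 1/4) to (u, u^2), 0 <= u <= x."""
    return x ** 3 / 6.0 + x / 8.0

def sector_area_numeric(x, n=20000):
    """The same area approximated by thin triangles with apex at the focus."""
    total = 0.0
    for i in range(n):
        u0, u1 = x * i / n, x * (i + 1) / n
        p = (u0, u0 * u0 - 0.25)      # vertices relative to the focus
        q = (u1, u1 * u1 - 0.25)
        total += 0.5 * abs(p[0] * q[1] - p[1] * q[0])
    return total

def x_at_area(A, iterations=60):
    """Solve x^3/6 + x/8 = A by Newton's method (the cubic is increasing)."""
    sign = 1.0 if A >= 0 else -1.0
    A = abs(A)
    x = 8.0 * A                        # solves the linear part, so it is at or beyond the root
    for _ in range(iterations):
        g = x ** 3 / 6.0 + x / 8.0 - A
        x -= g / (x * x / 2.0 + 0.125)
    return sign * x
```

If the area is swept at the constant rate h/2, the position at time t would then be `x_at_area(0.5 * h * t)`, measured from the passage through the vertex.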

The case of parabolic orbits is very special in this regard, because attempts to 
find the corresponding areas for elliptical orbits involve elliptic integrals, which 
cannot be expressed in elementary terms. Amazingly enough, Newton actually 
proves that we can’t expect to find algebraic parameterizations in the case of 
an ellipse, and his argument constitutes, in the words of Arnold [3; pp. 83 ff] 
“an astonishingly modern topological proof of a remarkable theorem on the 
transcendence of Abelian integrals”. The remarks in Arnold are expanded
somewhat in Chandrasekhar [2; pp. 133 ff], scholastic grousing can be found
in Cohen and Whitman [1; pp. 138-139], and a comprehensive account of the
problem can be found in Pourciau [3].

By the way, Newton analyzed the motion along parabolic orbits purely
geometrically (see Chandrasekhar [2; pp. 130-131]): If the line L is the
perpendicular bisector of SO, and the perpendicular bisector of SP intersects L at
H, then, as Chandrasekhar says, “With these constructions (which passes
understanding) Newton proves” that the point H moves along L with a uniform
velocity equal to 3/8 of the velocity of the planet at O!

[Figure: the parabola through P = (x, x²), with focus S and the point H on the line L.]







6. This problem gives a rather ad hoc solution of the equations of motion for a
particle under a radially symmetric central force. A more systematic derivation
is presented in Chapter 4, but this problem is important not only as preparation
for that derivation, but as an introduction to other techniques.

Recall that for a particle c moving under a central force, the vector c × v is
a constant vector w, so |c × v| is the constant

\[ |c \times v| = h, \qquad h = |w|. \tag{1} \]

We write the equation of our particle as

\[ c(t) = r(t)\cdot(\cos\theta(t), \sin\theta(t)), \]

so that we have

\[ r^2 = \langle c, c\rangle, \]

and consequently

\[ r r'' + r'^2 = \langle c, v'\rangle + |v|^2. \]

Since our force is radially symmetric, i.e., it depends only on the distance
from O, we have

\[ m v' = -(f\circ r)\cdot\frac{c}{r} \]

for some function f (with the − sign added so that a positive f corresponds to
an attractive force). Then

\[ \langle c, v'\rangle = -\frac{r}{m}(f\circ r), \]

and consequently

\[ r r'' + r'^2 = -\frac{r}{m}(f\circ r) + |v|^2. \tag{2} \]

(a) Show that for any vectors x, y ∈ R³ we have

\[ \langle x, y\rangle^2 + |x\times y|^2 = |x|^2\cdot|y|^2. \]

(You can either give a geometrically obvious explanation, or reduce to the case
x = (1, 0, 0) and $y = (y_1, y_2, 0)$ and calculate.)

(b) Apply this result to eliminate v from (2) and conclude that

\[ r'' = \frac{h^2}{r^3} - \frac{1}{m}(f\circ r), \]




where h is given by (1). Consequently,

\[ \Bigl(r'^2 + \frac{h^2}{r^2}\Bigr)' = -\frac{2}{m}(f\circ r)\,r' = -\frac{2}{m}(F\circ r)', \]

where F is a primitive of f, i.e., F’ = f. Thus, for an appropriate F we will
have

\[ r'^2 = -\frac{2}{m}(F\circ r) - \frac{h^2}{r^2}. \tag{*} \]

(c) We thus have a differential equation for r. We also have the equation

\[ r^2\theta' = h \tag{*$'$} \]

(page 56), so knowing r gives a differential equation for θ. However, we seldom
expect to solve for r or θ directly. For example, in the case of an inverse square
law, with an ellipse as solution, finding r or θ as a function of t would essentially
involve finding the areas of sectors of an ellipse, which involves elliptic functions.
Usually we concentrate on finding the shape of the solution, rather than its
particular parameterization, by finding $r\circ\theta^{-1}$.

From the equation

\[ (r\circ\theta^{-1})' = (r'\circ\theta^{-1})\cdot\frac{1}{\theta'\circ\theta^{-1}}, \]

which in the infinitely flexible Leibnizian notation can be written as

\[ \frac{dr}{d\theta} = \frac{dr}{dt}\Big/\frac{d\theta}{dt}, \]

equations (*) and (*’) yield the differential equation

\[ \frac{dr}{d\theta} = \pm\frac{r^2}{h}\sqrt{-\frac{2}{m}(F\circ r) - \frac{h^2}{r^2}}. \]

We can also “separate variables” to obtain

\[ d\theta = \pm\frac{h\,dr}{r^2\sqrt{-\dfrac{2}{m}(F\circ r) - \dfrac{h^2}{r^2}}}, \]




so that

\[ \theta(r) = \pm\int_{r_0}^{r} \frac{h\,dr}{r^2\sqrt{-\dfrac{2}{m}(F\circ r) - \dfrac{h^2}{r^2}}}. \]

(Only the r in the limit of integration on the right side is of significance; the
other r’s could just as well be replaced by any other “variable of integration”, but
notation of this sort, consistently used by physicists and others solving differential
equations, is more convenient.) This determines $\theta\circ r^{-1}$ [the expression θ(r)
means $\theta(r^{-1}(r))$], and thus $r\circ\theta^{-1}$. The extent to which specific formulas can
be written down depends on our ability to determine the integral in elementary
terms; we will explore this question further in Chapter 4.


7. Consider the ellipse $x^2/a^2 + y^2/b^2 = 1$, whose upper half is the graph of
$x \mapsto (b/a)(a^2 - x^2)^{1/2}$. The focus is the point $(c, 0)$ with $c^2 = a^2 - b^2$. We
consider the tangent line L at a point $(x_0, y_0)$ and the perpendicular distance
$f(x)$ from L to the point $(x, y) = (x, (b/a)(a^2 - x^2)^{1/2})$ on the ellipse.


(a) Recall that the distance from the straight line with equation y = Mx + B
to the point (C, D) is

    $\frac{|CM - D + B|}{\sqrt{M^2 + 1}}.$

So, if the slope of L is v, then

    $\sqrt{v^2 + 1}\; f(x) = xv - y + \text{constant} = xv - \frac{b}{a}(a^2 - x^2)^{1/2} + \text{constant}.$
a 


Conclude that

    $\sqrt{v^2 + 1}\; f''(x_0) = ba\,(a^2 - x_0^2)^{-3/2},$




and then, using the value of v, that

    $f''(x_0) = ba\,(a^2 - x_0^2)^{-3/2}\left[\frac{b^2x_0^2}{a^2(a^2 - x_0^2)} + 1\right]^{-1/2}$

    $\qquad = ba\,(a^2 - x_0^2)^{-3/2}\left[\frac{a^4 - c^2x_0^2}{a^2(a^2 - x_0^2)}\right]^{-1/2}$

    $\qquad = \frac{ba^2}{(a^2 - x_0^2)(a^4 - c^2x_0^2)^{1/2}}.$
(b) Let F(x) be the length of the line from (x, y) to L that is parallel to the line
from the focus $(c, 0)$ to $(x_0, y_0)$, so that $F(x) = f(x)\sec\beta$.




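Using

    $\tan\theta = -\frac{b}{a}\,x_0\,(a^2 - x_0^2)^{-1/2}$

we have

    $\tan(\alpha + \beta) = \frac{a}{bx_0}\,(a^2 - x_0^2)^{1/2},$

while

    $\tan\alpha = \frac{y_0}{x_0 - c} = \frac{b}{a}\cdot\frac{(a^2 - x_0^2)^{1/2}}{x_0 - c}.$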
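Conclude that

    $\tan\beta = \frac{\tan(\alpha + \beta) - \tan\alpha}{1 + \tan(\alpha + \beta)\tan\alpha}$

    $\qquad = (a^2 - x_0^2)^{1/2}\cdot\frac{a^2x_0 - a^2c - b^2x_0}{ab\,x_0(x_0 - c)}\Big/\frac{a^2 - x_0c}{x_0(x_0 - c)}$

    $\qquad = \frac{(a^2 - x_0^2)^{1/2}\,(c^2x_0 - a^2c)}{ab\,(a^2 - x_0c)}$

    $\qquad = -\frac{(a^2 - x_0^2)^{1/2}\,c}{ab}.$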




Compute that

    $\sec\beta = \frac{(a^4 - c^2x_0^2)^{1/2}}{ab},$

and then that

    $F''(x_0) = \frac{a}{a^2 - x_0^2}.$
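(c) Now let

    $\mu = \frac{y_0}{x_0 - c} = \frac{b}{a}\cdot\frac{(a^2 - x_0^2)^{1/2}}{x_0 - c}$

be the slope of the line between the focus $(c, 0)$ and $(x_0, y_0)$. For the distance
g(x) from (x, y) to this line we have

    $g(x) = \left(\frac{b}{a}\,(a^2 - x_0^2)^{1/2}\,\frac{x - c}{x_0 - c} - \frac{b}{a}\,(a^2 - x^2)^{1/2}\right)(\mu^2 + 1)^{-1/2},$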


and for $G(x) = g(x)^2$ we have

    $G''(x_0) = 2\,g'(x_0)^2 = \frac{2b^2}{a^2}\cdot\frac{(a^2 - cx_0)^2}{(x_0 - c)^2(a^2 - x_0^2)}\cdot\frac{1}{\mu^2 + 1}.$

Conclude, finally, that

    $G''(x_0) = \frac{2b^2}{a^2 - x_0^2}.$


CHAPTER 3 
CONSERVATION LAWS 


In this chapter we will look at the basic “conservation” laws of mechanics, one
of which, conservation of momentum, has already been briefly mentioned in
Chapter 1. The considerations of that chapter might lead us to consider a
“system of particles” consisting of


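(a) certain particles $c_1, \dots, c_K : \mathbb{R} \to \mathbb{R}^3$,

(b) with positive masses $m_1, \dots, m_K \in \mathbb{R}$,

(c) functions $F_i^e : \mathbb{R} \to \mathbb{R}^3$,

(d) functions $F_{ij} = -F_{ji} : \mathbb{R} \to \mathbb{R}^3$,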


with the following basic property: If we set

    $F_i = F_i^e + \sum_j F_{ij},$

then

    $F_i = m_i \cdot c_i''.$


Here $F_i^e(t)$ represents an “external force” on the particle $c_i$ at time t, while the
$F_{ij}(t)$ represent “internal forces” between $c_i(t)$ and $c_j(t)$, satisfying Newton's
third law, and consequently $F_i(t)$ represents the total force on the particle $c_i$ at
time t. For forces satisfying the “strong version” of Newton's third law (page 25),
condition (d) should also stipulate that $F_{ij}(t)$ is a multiple of $c_i(t) - c_j(t)$.

Of course, in practice, the $F_i^e$ and $F_{ij}$ will often have simple expressions in
terms of other functions. For example, the external force $F_i^e(t)$ might be of the
form $m_i \cdot f(c_i(t))$ for a vector field f on $\mathbb{R}^3$, e.g., the gravitational attraction due
to some external body, while the internal forces $F_{ij}$ might be some function of
$m_i$, $m_j$ and the distance $|c_i - c_j|$.

Our more general definition allows all sorts of more complicated situations, 
for example one where the force $F_i^e$ depends not only on t and $c_i(t)$ but on
the whole collection of $\{c_j(t)\}$. A simple instance would be a system of many
particles representing a space ship with rocket propulsion, where the direction 
of the force depends on the particular angle at which the space ship is rotated 
at any particular time. (Presumably, the space ship has more than one rocket, 
so that it can steer). 




Conservation of momentum. Although conservation of momentum, as stated 
in Chapter 1, involved only internal forces, we can easily state a generalization 
allowing for external forces. We set $F = \sum_i F_i^e$, the total external force.


1. PROPOSITION (MOMENTUM LAW). The derivative of the total momentum
is the total external force,

    $F = \Bigl(\sum_i m_i \cdot v_i\Bigr)'.$

Here we have to regard the various $F_i^e$ simply as elements of $\mathbb{R}^3$, rather than
as tangent vectors at different points of $\mathbb{R}^3$, and similarly for the $v_i$.


This formulation gains considerable significance when we introduce the concept
of the center of mass of the system $\{c_i\}$, which represents the “average”
position of the particles $c_i$ weighted according to their masses:

    $C = \frac{\sum_i m_i c_i}{\sum_i m_i}.$

More precisely, we should define the center of mass as the particle consisting of
the path C with the mass $M = \sum_i m_i$.

If all $F_i^e = 0$, so that $\sum_i m_i \cdot v_i$ is constant, then
$C'' = \frac{1}{M}\sum_i m_i c_i'' = \frac{1}{M}\sum_i m_i v_i' = \frac{1}{M}\bigl(\sum_i m_i \cdot v_i\bigr)'$,
so that we also have $C'' = 0$. Thus, $C'$ is
constant; in other words, the center of mass moves with uniform velocity.


More generally, we have 


2. PROPOSITION. If $F = \sum_i F_i^e$ is the total external force, then

    $F = M \cdot C'',$


so that the center of mass particle simply moves as if it were acted upon by the 
total force F. 


PROOF. We have

    $M \cdot C'' = \sum_i m_i \cdot c_i''$

    $\qquad = \sum_i F_i$

    $\qquad = \sum_i F_i^e + \sum_i \sum_j F_{ij}$

    $\qquad = \sum_i F_i^e = F.$



Although the “particle” C might not be one of the particles in our system, 
this result is seldom regarded as particularly “theoretical” — instead it allows us 
to get a very simple picture of very complex phenomena. For example, in the 
figure on page 10, showing a rod executing a complicated revolving motion, 
the center of mass, which does happen to be a point on the rod in this case, 
simply moves in a parabola, just like a point mass. A striking illustration may 
be obtained with a time-exposure photograph taken when a baton is tossed in 
the air, with lights at the ends and the center of mass, giving a picture like this: 





We usually think of a rod as a “rigid body”, a concept whose analysis we 
have still shied away from. At first, that might seem to make the result even 
more impressive: in a real rod, with all sorts of complicated intermolecular 
forces, which make it approximately “rigid”, but not truly so, it is still true that 
the center of mass moves according to a simple law. But that is a somewhat 
misleading way of construing the result, since rigidity of the rod is required in 
order to identify its center of mass with a particular point of the rod, on which 
we can attach one of the lights. 

Center of mass is often called “center of gravity”, a concept that goes back at
least to Archimedes (cf. the Prologue). These concepts are not identical except
in a uniform gravitational field (which applies, of course, to reasonable sized
objects on the earth's surface), but the difference is often ignored.


Conservation of angular momentum. The use of the cross-product $\times$ at the
beginning of the previous chapter could be regarded simply as a convenient
abbreviation for manipulations with determinants. But there is a more important
reason why this special product of $\mathbb{R}^3$ is significant.

For any vector $w \in \mathbb{R}^3$, consider the one-parameter family of maps
$B(t)\colon \mathbb{R}^3 \to \mathbb{R}^3$, where B(t) is a counter-clockwise rotation through an angle
of $t|w|$ radians around the axis through w [choosing an orientation $(v_1, v_2)$ of the plane
perpendicular to w so that $(v_1, v_2, w)$ is the usual orientation of $\mathbb{R}^3$]. Now
consider the vector field generated by this one-parameter family. In other words,
for each $p \in \mathbb{R}^3$ consider the curve $B_p(t) = B(t)(p)$, and then look at the
tangent vector $X_p$ of this curve at 0.





To compute $X_p$ geometrically, we note that $X_p$ is clearly perpendicular to
both p and w. Its length is also easy to determine. When p happens to lie
in the plane perpendicular to w, as in (a), the point p rotates in a circle of
radius $|p|$, and $X_p$ has length $|p| \cdot |w|$. More generally (b), the point p rotates
in a circle of radius $|p| \sin\theta$, where $\theta$ is the angle between w and p, so that
$X_p$ has length $|p| \cdot |w| \cdot \sin\theta$. Thus,
$X_p$ is just the geometrically defined cross-product $w \times p$.





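For an analytic determination of $X_p = B_p'(0)$, we note that since the B(t)
are all orthogonal, and B(0) = I, the derivative B'(0) is skew-adjoint, with a
skew-symmetric matrix M, which we will write in the form

    $M = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix}.$

Then the vector $X_p$ is the 3-tuple whose transpose $X_p^{\,t}$ is given by

    $X_p^{\,t} = M \cdot (p_1, p_2, p_3)^t$

    $\qquad = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix} \cdot (p_1, p_2, p_3)^t$

    $\qquad = (-p_2\omega_3 + p_3\omega_2,\ -p_3\omega_1 + p_1\omega_3,\ -p_1\omega_2 + p_2\omega_1)^t.$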




Setting $\omega = (\omega_1, \omega_2, \omega_3)$, we then have $X_p = \omega \times p$. Moreover, $\omega$ is easy to
identify, because

    $B_w(t) = w$ for all $t \implies 0 = X_w = \omega \times w,$

so $\omega$ must be a multiple of w, and it is easy to check, by considering some specially
chosen vector, that in fact $\omega = w$. Thus, one might say that the cross-product $\times$
is special to $\mathbb{R}^3$ because n = 3 is the only dimension where O(n) has
dimension n. More to the point, we have

PROPOSITION. The vector fields in $\mathbb{R}^3$ generated by rotations about an axis
are of the form $p \mapsto \omega \times p$ for $\omega \in \mathbb{R}^3$.
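
The Proposition can be checked numerically. The following Python sketch is an illustration only: it realizes B(t) by Rodrigues' rotation formula (a standard formula, but one that is swapped in here and does not appear in the text), differentiates $t \mapsto B(t)(p)$ at 0 by a central difference, and compares the result with $w \times p$. The function names and the step size `h` are assumptions, and w is assumed to be non-zero.

```python
import math

def cross(a, b):
    """Cross product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(w, t, p):
    """B(t)(p): rotate p through the angle t|w| about the axis through w.

    Uses Rodrigues' formula; counter-clockwise with respect to the usual
    orientation. Assumes w is not the zero vector.
    """
    norm = math.sqrt(sum(x * x for x in w))
    k = tuple(x / norm for x in w)            # unit vector along the axis
    angle = t * norm
    k_cross_p = cross(k, p)
    k_dot_p = sum(ki * pi for ki, pi in zip(k, p))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return tuple(p[i] * cos_a + k_cross_p[i] * sin_a + k[i] * k_dot_p * (1 - cos_a)
                 for i in range(3))

def tangent_at_zero(w, p, h=1e-6):
    """Central-difference approximation to X_p, the tangent vector of B_p at 0."""
    plus, minus = rotate(w, h, p), rotate(w, -h, p)
    return tuple((plus[i] - minus[i]) / (2 * h) for i in range(3))
```

For instance, with `w = (1.0, -2.0, 0.5)` and `p = (0.3, 0.7, -1.1)`, the tuples `tangent_at_zero(w, p)` and `cross(w, p)` should agree to within the accuracy of the finite difference.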


For a particle c with velocity vector v we can consider the function $c \times v$
from $\mathbb{R}$ to $\mathbb{R}^3$, which is called the angular velocity of the particle; if c(t) =
(x(t), y(t), z(t)) for functions x, y, and z, then the angular velocity of c is

    $(A_c)$   $(yz' - y'z,\ x'z - xz',\ xy' - x'y).$


For a particle whose mass is m, the cross-product $L = c \times mv$ is called its
angular momentum. The angular velocity and momentum just defined are
“with respect to the origin 0”: for any other point P, the angular momentum
with respect to P is the cross-product

    $L_P = (c - P) \times mv.$


For a system of particles $(c_1, \dots, c_K)$ we define the angular momentum L of
the system, with respect to 0, as

    $L = \sum_{i=1}^{K} c_i \times m_i v_i;$

here it is naturally necessary to consider all $c_i \times m_i v_i$ as vectors at a single
point, rather than as tangent vectors at different points. Note that the equation
$L' = \sum_i (c_i \times m_i c_i')'$ reduces to

    (L')   $L' = \sum_{i=1}^{K} c_i \times m_i c_i''.$




More generally, we define the angular momentum $L_P$ with respect to P
as $L_P = \sum_i (c_i - P) \times m_i v_i$. In particular, suppose we take P to be the
center of mass C of the system (this means that we may be considering the
angular momentum with respect to different points at different times). Letting
$M = \sum_i m_i$, the “mass” of the particle C, we then have

    $\sum_i m_i c_i \times v_i = \sum_i m_i (c_i - C) \times v_i + \sum_i m_i C \times v_i$

    $\qquad = L_C + \bigl[C \times \bigl(\sum_i m_i v_i\bigr)\bigr]$

    $\qquad = L_C + [C \times MC'],$

so that we can write

    $L = L_C + (C \times MC').$


The vector $L_C$, the angular momentum with respect to the center of mass, is
also called the “rotational angular momentum”, so our equation says that the
total angular momentum L is the sum of the rotational angular momentum $L_C$
and the angular momentum of the center of mass with respect to 0.
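
The decomposition $L = L_C + (C \times MC')$ is easy to test on sample data. The sketch below is only an illustration: the three masses, positions and velocities are arbitrary assumed values, and the helper names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def weighted_average(masses, vectors):
    """(sum_i m_i * vector_i) / M; gives C for positions and C' for velocities."""
    total = (0.0, 0.0, 0.0)
    for m, vec in zip(masses, vectors):
        total = add(total, scale(m, vec))
    return scale(1.0 / sum(masses), total)

def angular_momentum(masses, positions, velocities, about=(0.0, 0.0, 0.0)):
    """L_P = sum_i (c_i - P) x m_i v_i, with P = about."""
    total = (0.0, 0.0, 0.0)
    for m, c, v in zip(masses, positions, velocities):
        total = add(total, cross(sub(c, about), scale(m, v)))
    return total

# Arbitrary sample data (assumed values, for illustration only).
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 2.0), (-1.0, 3.0, 0.5), (0.0, -2.0, 1.0)]
velocities = [(0.5, 1.0, 0.0), (0.0, -1.0, 2.0), (3.0, 0.0, -1.0)]

M = sum(masses)
C = weighted_average(masses, positions)          # center of mass
C_prime = weighted_average(masses, velocities)   # its velocity C'

L = angular_momentum(masses, positions, velocities)
L_C = angular_momentum(masses, positions, velocities, about=C)
orbital = cross(C, scale(M, C_prime))            # C x MC'
```

Here `L` should equal `add(L_C, orbital)` up to rounding error, whatever sample data is used.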
If instead of a momentum vector we consider an arbitrary force F at a point c,
the cross-product

    $\tau = c \times F$

is called the torque of the force with respect to 0, while $\tau_P = (c - P) \times F$ is the
torque with respect to P. (Although I have used the physicists' L for angular
momentum, I couldn't bring myself to use the standard N for torque.)

Similarly, we define the torque of a system of forces on a system of particles; 
here it is again necessary to consider the individual torques as being vectors 
at one point, even though we naturally think of the forces as being applied at 
different points. 


3. PROPOSITION (ANGULAR MOMENTUM LAW). If our system satisfies
the strong form of the third law, then the total torque is the derivative of
the total angular momentum,

    $\tau = L'.$

PROOF. We have

    $L' = \sum_i c_i \times m_i c_i''$      by equation (L')

    $\quad = \sum_i c_i \times F_i$

    $\quad = \sum_i c_i \times F_i^e + \sum_i \sum_j c_i \times F_{ij}$

    $\quad = \tau + \sum_{i,j} c_i \times F_{ij}.$

The strong form of the third law allows us to write

    $F_{ij} = \lambda_{ij}(c_i - c_j),$

with $\lambda_{ij} = \lambda_{ji}$; so we have

    $\sum_i \sum_j c_i \times F_{ij} = \sum_{i,j} \lambda_{ij}\,[c_i \times c_i - c_i \times c_j],$

which vanishes, since $c_i \times c_i = 0$, while $c_i \times c_j = -c_j \times c_i$ and $\lambda_{ij} = \lambda_{ji}$.

Easy manipulations give us the more general

4. COROLLARY. For any point P,

    $\tau_P = L_P'.$
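
The role of the strong form of the third law in this proof can be seen on a single pair of particles. The Python sketch below is an illustration with assumed sample values: for an internal pair $F_{ij} = -F_{ji}$ it computes the torque $c_i \times F_{ij} + c_j \times F_{ji}$ about the origin, once for a force along $c_i - c_j$ and once for a force that satisfies only $F_{ij} = -F_{ji}$.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def pair_torque(ci, cj, fij):
    """c_i x F_ij + c_j x F_ji about the origin, where F_ji = -F_ij."""
    fji = tuple(-x for x in fij)
    return add(cross(ci, fij), cross(cj, fji))

# Sample positions (assumed values).
ci, cj = (1.0, 2.0, 0.0), (-1.0, 0.5, 3.0)

# Strong form: F_ij is a multiple (here 2.5, an arbitrary choice) of c_i - c_j.
central = tuple(2.5 * x for x in sub(ci, cj))
torque_central = pair_torque(ci, cj, central)

# Weak form only: F_ij = -F_ji still holds, but F_ij is not along c_i - c_j.
sideways = (0.0, 0.0, 1.0)
torque_sideways = pair_torque(ci, cj, sideways)
```

The first torque vanishes, while the second equals $(c_i - c_j) \times F_{ij}$, which is non-zero here; internal forces of the second kind would change the total angular momentum even in the absence of external torque.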


In particular, of course, if the torque is 0, then angular momentum is conserved.
This certainly happens in the special case of a single particle moving
under a central force, where the external force F is a multiple of c, so that
$\tau = c \times F = 0$. This was noted by Newton as the first Corollary of his
Proposition 1 (page 55):


Corollary 1. In nonresisting spaces, the velocity of a body attracted
to an immobile center is inversely as the perpendicular dropped from
that center to the straight line which is tangent to the orbit.


As we saw in Chapter 2, the particle actually stays in a plane, and if we have
c(t) = (x(t), y(t), 0), say, then conservation of angular momentum just says, by
equation $(A_c)$, that $xy' - yx'$ is constant. Even the somewhat more general rule
that angular momentum is conserved in the absence of external forces was not
stated until quite some time afterwards, and this law was known for a long time
simply as “the law of areas”, or Flächensatz in German.

The evocative term “torque” (from the Latin torquere, to twist) was not introduced
until the 19th century. Before that, the cross-product $c \times F$ was called
the moment of the force F at the point c, with respect to 0. Here “moment” is
being used in the sense of “importance” or “significance” (e.g., a matter of great
moment), this significance having been noted long before in terms of the law of
the lever. Correspondingly, angular momentum was known as the “moment of
momentum”, a term which has not yet been totally expunged.

A standard elementary illustration of the law of conservation of angular momentum
is provided by a person seated on a rotating stool with arms extended
out holding weights, and then increasing the speed of the spin, often quite
dramatically, simply by pulling the weights inward. Similarly, ice-skaters speed up
their turns by pulling their arms in; divers, starting their dive with a small
angular momentum, do rapid somersaults by pulling their arms and knees in; and
gymnasts do all sorts of tricks.

By the way, without appealing to conservation of angular momentum we can 
explain the speed-up as a simple consequence of the parallelogram rule for 
forces, or even for velocities: the sum of the velocity v that the weight already 
has and the velocity w that it acquires as a result of the inward pull is the 


diagonal of the rectangle spanned by these two, and consequently has a greater 
length. 

In these examples, we merely altered the given non-zero angular momentum,
but something interesting occurs even when we start with angular momentum 0.
Moving the weights along a circle in one direction contributes a certain amount
of angular momentum to the system of weights-plus-person, which must be
countered by an opposite amount of angular momentum in the system, so the
seated person must rotate in the opposite direction. At the end of the motion,
when the weights are no longer being rotated, the person will have stopped
rotating, but will be facing in a different direction; cats use this mechanism to
land on their paws even when dropped from an upside-down position.

In this respect, rotation is quite different from linear motion. A system cannot
change its position using only internal forces, and no external forces. On a 
perfectly frictionless ice surface you can change the direction in which you are 
facing, but you can’t move the position of your center of mass. (Of course, you 
can forcefully exhale, providing yourself with rocket propulsion, making use of 
the fact that the air inside your lungs is a part of your system that you aren’t 
attached to—or you could just throw your coat away.) 


The momentum law and the angular momentum law are the first two great 
conservation laws of mechanics, and they apply to all mechanical systems, al- 
though their application may also require further understanding about rigid 
bodies and other matters. We should also mention that they are vector equa- 
tions, so, for example, if the x-component of the total external force F is 0, 
then the x-component of the total momentum is constant; or to put it more 
generally, if the total external force F is 0 in one particular direction, then the 
total momentum in that direction is also constant. This probably doesn't seem
particularly useful, but the analogue for angular momentum definitely can be
(cf. Problem 5): if the torque in some direction is 0, then the component of
the angular momentum in that direction is constant.


The third conservation law is quite different: it is both much more special 
and much more general. 


Conservation of energy: kinetic and potential energy. As Galileo had noted,
for a body falling under the acceleration of gravity, the distance s that it travels
after being released from rest satisfies $s = at^2$ for some a, so that $v = s' = 2at$
satisfies $v^2 = \text{constant} \cdot s$, expressing v in terms of the distance traveled, rather
than in terms of the time traveled.

More generally, suppose a body falls due to a force that depends only on its
height x from the earth's surface,

    $mx''(t) = -f(x(t)),$

for some function f; as in Problem 2-6, we add the minus sign so that a positive f
corresponds to an attractive force towards the earth's surface, the direction in
which x decreases. Although we may not be able to solve for x explicitly, we can
still get out information about $v = x'$. We use the obvious trick of multiplying
both sides of the above equation by $x'(t)$, so that the right side becomes a
derivative,

    $mx'(t)x''(t) = -f(x(t)) \cdot x'(t) = -(F \circ x)'(t)$   for $F' = f$,


and then observe that the left side is also a derivative, so that we get

    $(\tfrac12 m x'^2)' = -(F \circ x)'.$

The quantity $T = \tfrac12 mv^2$ is called the kinetic energy of the body, so if we let
$x_i = x(t_i)$ for i = 0, 1 and $v_i = v(t_i) = x'(t_i)$, we have

    (*)   $T(t_1) - T(t_0) = \tfrac12 mv_1^2 - \tfrac12 mv_0^2 = -F(x_1) + F(x_0),$


so that the difference T(t;) — T(to) of the kinetic energy at two times depends 
only on the heights at the two times. 

As an aside, we point out that an alternative approach is suggested if we know
that we are looking for an expression for v in terms of x. As on page 75, we use
Leibnizian notation to transform $mv'(t) = -f(x(t))$ into a formula for $dv/dx$:

    $\frac{dv}{dx} = \frac{dv}{dt}\Big/\frac{dx}{dt} = \frac{-f(x)}{mv},$

    $mv\,\frac{dv}{dx} = \tfrac12 m\,\frac{dv^2}{dx} = -f(x),$

    $\tfrac12 mv^2 = -\int f(x)\,dx.$


We've been discussing a one-dimensional situation, or equivalently one in
which our force always points in one direction, but the same conclusion holds
for the more general case of a radially symmetric central force. We introduce
polar coordinates $(r, \theta)$ for the plane in which the motion takes place, and let $\mathbf{r}$
be the unit vector field pointing away from the origin, while $\boldsymbol{\theta}$ is the perpendicular
unit vector field. At any point x our radially symmetric central force has the
value $-f(|x|)\,\mathbf{r}$ for some function f. If for convenience we introduce the usual
“abuse of notation” of allowing r to stand for $r \circ c$ and $\theta$ to stand for $\theta \circ c$, then

    $(\tfrac12 mv^2)' = \tfrac12 m \cdot \langle v, v\rangle' = m\langle v, v'\rangle = \langle v, mv'\rangle$

    $\qquad = \langle v,\ -(f \circ r)\,\mathbf{r}\rangle$

    $\qquad = \langle r'\mathbf{r} + r\theta'\boldsymbol{\theta},\ -(f \circ r)\,\mathbf{r}\rangle$

    $\qquad = -(f \circ r)\,r' = -(F \circ r)'$   for $F' = f$.

We thus have

    (**)   $T(t_1) - T(t_0) = -F(r(t_1)) + F(r(t_0)),$




so that the difference in kinetic energy at two times depends only on the dis- 
tances from the origin at the two times. 
Nowadays this result is usually stated rather differently. If we let $V\colon \mathbb{R}^3 \to \mathbb{R}$
be the function

    $V = F \circ r,$

then (**) becomes

    $T(t_1) - T(t_0) = -F(r(c(t_1))) + F(r(c(t_0)))$
    $\qquad = -V(c(t_1)) + V(c(t_0)),$

and if we choose a fixed $t_0$ we find that

    (***)   $T(t) + V(c(t))$ is constant.


The quantity V(p) is called the potential energy of the particle at $p \in \mathbb{R}^3$, so
equation (***), called conservation of energy, states that for radially symmetric
central forces the sum of the kinetic energy and the potential energy of a
particle is constant throughout its path. The important point here is that V(c(t))
depends only on the position c(t) of the particle, not on the path c itself.

Obviously, the function V is only determined up to a constant. For elementary
problems involving free falling bodies near the surface of the earth, it is
customary to consider V to be 0 on the earth's surface, so that its value when
the body is released from some height is positive. As the body falls, its potential
energy decreases as its kinetic energy increases. This accords with the usual
interpretation of V as the kinetic energy that the body “potentially” has, i.e.,
the kinetic energy that it can acquire by being released, and allowed to fall to
earth.

In the case of an inverse square force, a body falling radially toward the center,
with distance r(t) from the center given by

    $m\,r''(t) = -K/r(t)^2,$

has

    $F'(r) = f(r) = K/r^2 \implies V(p) = -\frac{K}{|p|} + \text{constant}.$

It is convenient to take the constant to be 0, so that V is 0 at $\infty$. For a planet
moving in an ellipse, V is larger (though negative) at points further from the
sun, so the kinetic energy is smaller there (as implied by Kepler's second law).

[Figure: elliptical orbit, labeled “small v” and “large v”.]




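More generally, we define a force $F = (F_1, F_2, F_3)$ to be conservative, with
potential energy function V, if the conservation of energy equation

    (C)   $\tfrac12 m\langle v(t), v(t)\rangle + V(c(t)) = \text{constant}$

holds for all particles c(t) moving under the force F. For the standard coordinate
system $(x^1, x^2, x^3)$ on $\mathbb{R}^3$, differentiating gives

    $0 = \langle v(t), mv'(t)\rangle + \sum_{i=1}^{3} \frac{\partial V}{\partial x^i}(c(t)) \cdot c_i'(t)$

    $\quad = \langle v(t), F(c(t))\rangle + \sum_{i=1}^{3} \frac{\partial V}{\partial x^i}(c(t)) \cdot c_i'(t).$

Choosing a path c with c(0) = p, and evaluating at t = 0, we obtain

    $0 = \langle v(0), F(p)\rangle + \sum_{i=1}^{3} \frac{\partial V}{\partial x^i}(p) \cdot c_i'(0).$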


Since, under the standard identification of tangent vectors of $\mathbb{R}^3$ with $\mathbb{R}^3$ itself,
we also have

    $v(0) = (c_1'(0), c_2'(0), c_3'(0)),$

we see, by choosing c with only one $c_i'(0) \ne 0$, that we must have

    (C')   $F = (F_1, F_2, F_3) = -\left(\frac{\partial V}{\partial x^1}, \frac{\partial V}{\partial x^2}, \frac{\partial V}{\partial x^3}\right),$

which physicists usually write as

    $F = -\operatorname{grad} V.$

Equivalently,

    $\left\langle F, \frac{\partial}{\partial x^i}\right\rangle = -\frac{\partial}{\partial x^i}(V),$

and thus, more generally, for any tangent vector v we have

    $\langle F, v\rangle = -v(V),$

for the usual operation of a tangent vector v on a function.




Conversely, suppose that F satisfies (C'). For any field $F = (F_1, F_2, F_3)$ we
have

    $T(t_1) - T(t_0) = \int_{t_0}^{t_1} \langle v(t), mv'(t)\rangle\,dt = \int_{t_0}^{t_1} \langle v(t), F(c(t))\rangle\,dt,$

and if we introduce the 1-form

    $\omega = F_1\,dx^1 + F_2\,dx^2 + F_3\,dx^3$

and let $\gamma$ be the curve $\gamma = c|[t_0, t_1]$, this can be written as

    (*)   $T(t_1) - T(t_0) = \int_\gamma \omega.$


y 


Now if F satisfies (C'), then $\omega = -dV$, so we have

    $T(t_1) - T(t_0) = \int_\gamma -dV$

    $\qquad = -V(c(t_1)) + V(c(t_0)),$

which implies that F is conservative, with potential function V.
The quantity

    $\int_{t_0}^{t_1} \langle v(t), F(c(t))\rangle\,dt = \int_\gamma \omega$

is called the work done by the force F on the particle c as it moves along the
path $\gamma$. We have just seen that for conservative forces this depends only on the
end-points of the path. As a simple example, consider a closed elliptical path,
on a time interval $[0, T_0]$, of a particle moving under an inverse square force F.
The total work done by F along this path must be 0, since that is the total work
done on the interval [0, 0], which has the same end-points.
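
The independence of the work from the path can be checked numerically for an inverse square force. The Python sketch below is an illustration under stated assumptions: the strength `K`, the two end-points, the second path `detour`, and the midpoint-rule approximation of the line integral are all choices made here, not taken from the text.

```python
import math

K = 1.0  # hypothetical strength of the inverse square force

def force(p):
    """Inverse square force -K p / |p|^3 at a point p of the plane."""
    r = math.hypot(p[0], p[1])
    return (-K * p[0] / r ** 3, -K * p[1] / r ** 3)

def potential(p):
    """V(p) = -K / |p|, so that the force is -grad V."""
    return -K / math.hypot(p[0], p[1])

def work(path, n=20000):
    """Midpoint-rule approximation of the integral of F along path: [0, 1] -> R^2."""
    total = 0.0
    for k in range(n):
        a, b = path(k / n), path((k + 1) / n)
        f = force(path((k + 0.5) / n))
        total += f[0] * (b[0] - a[0]) + f[1] * (b[1] - a[1])
    return total

start, end = (1.0, 0.0), (0.0, 2.0)   # assumed end-points

def straight(s):
    """Straight segment from start to end."""
    return (start[0] + s * (end[0] - start[0]),
            start[1] + s * (end[1] - start[1]))

def detour(s):
    """A second path with the same end-points: the segment plus a bump."""
    x, y = straight(s)
    bump = math.sin(math.pi * s)
    return (x + bump, y + bump)
```

Both `work(straight)` and `work(detour)` should be close to `-potential(end) + potential(start)`, in agreement with the formula $T(t_1) - T(t_0) = -V(c(t_1)) + V(c(t_0))$.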

In general, of course, there usually won't be more than one trajectory between
two points. The more interesting situation (and the one that connects with our
every-day notion of work) arises when we consider the work done by a force as
we move it along some other path, i.e., the work that has to be done against the
force field F in order to move a particle from one point to another. For example,
raising a particle of mass m from height $h_0$ to height $h_1$ near the earth's surface
requires a total work of $mg(h_1 - h_0)$, no matter what path we take, provided
that periods during which the particle is moving downward instead of upwards
are regarded as contributing negative work.

This property of conservative forces is usually regarded as the more important
one in physics books, so they define a force F to be conservative if $\int_\gamma \omega$ depends
only on the end-points of $\gamma$. This immediately implies our previous definition,
since we can define

    $V(p) = -\int_\gamma \omega$

where $\gamma$ is any path from a fixed point $p_0$ to p, and equation (*) on page 91
immediately leads to the conservation of energy equation¹

    (C)   $\tfrac12 m\langle v(t), v(t)\rangle + V(c(t)) = \text{constant}.$


Looking at the calculation for equation (**) on page 88 we see that it still
holds if we replace F by $F + F_1$ where $F_1$ is always perpendicular to v; as
physicists would express it, the work done by the extra force $F_1$ is 0, by the very
hypothesis that it is always perpendicular to v.

In particular, instead of a body moving under the gravitational force of the
earth, consider one suspended by a thread, so that we have a pendulum bob,
which we have analyzed (somewhat informally) in Problem 1-17. The total force
on the pendulum bob is then $F - F_1$, where F is the conservative force
from the gravitation field of the earth, while $F_1$ is the force that was introduced
in our analysis, pointing along the thread. Since $F_1$ points along the thread,
$\langle v, F_1\rangle = 0$, so we still have the conservation of energy (C).


¹ For some strange reason, physics books (and even mathematics books) usually eschew
this simple direct argument, instead noting that the dependence of $\int_\gamma \omega$ only on the
end-points of $\gamma$ implies that $\int_\gamma \omega = 0$ for all closed $\gamma$, so that for all 2-chains $\sigma$ we have,
by Stokes' theorem, $0 = \int_{\partial\sigma} \omega = \int_\sigma d\omega$, and thus we must have $d\omega = 0$. This implies
(with proper conditions on the region where F is defined) that $\omega = -dV$ for some V.




This has the interesting consequence that although we cannot explicitly solve
the pendulum equation derived on page 47, we can still say what the speed v
of the bob is at any height h, because we have

    $mgh + \tfrac12 mv^2 = \text{constant},$

so we just have to know the height ho at which we released the bob, with v = 0 
(compare Problem 1-17). The pendulum can be regarded as a mechanism that 
is continually interchanging potential energy and kinetic energy. At the top of 
the swing the kinetic energy is 0, while at the bottom of the swing, the difference 
in potential energy has been converted to kinetic energy, just sufficient to raise 
it up to the same height at which it started. 
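
As a hedged illustration of this use of conservation of energy, the Python sketch below computes the speed of the bob at height h from $mgh + \tfrac12 mv^2 = \text{constant}$, and compares it with a direct numerical integration. The integration assumes the standard pendulum equation $\theta'' = -(g/l)\sin\theta$ for a thread of length l; the function names, the value of `g`, the step size and the velocity Verlet scheme are assumptions made for this sketch.

```python
import math

def bob_speed(h, h0, g=9.8):
    """Speed at height h of a bob released from rest at height h0 >= h.

    Follows from m*g*h + m*v**2/2 = m*g*h0; the mass m cancels.
    """
    if h > h0:
        raise ValueError("the bob never rises above its release height")
    return math.sqrt(2 * g * (h0 - h))

def simulate_pendulum(theta0, length=1.0, g=9.8, dt=1e-4, steps=5000):
    """Integrate theta'' = -(g/length) sin(theta), starting at rest at angle theta0.

    Returns the height above the lowest point and the speed after steps*dt
    seconds, using velocity Verlet (a choice made for this sketch).
    """
    theta, omega = theta0, 0.0
    acc = -(g / length) * math.sin(theta)
    for _ in range(steps):
        omega += 0.5 * dt * acc
        theta += dt * omega
        acc = -(g / length) * math.sin(theta)
        omega += 0.5 * dt * acc
    height = length * (1 - math.cos(theta))
    return height, abs(length * omega)
```

For example, with release angle 1 radian and unit length, the release height is `1 - math.cos(1.0)`; the pair `height, speed = simulate_pendulum(1.0)` should then satisfy `speed` close to `bob_speed(height, 1 - math.cos(1.0))`, without the pendulum equation ever being solved explicitly.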

Similarly, on page 30 we mentioned the usual elementary analysis of a block
sliding down an inclined plane, where we assume that the block is acted upon
by the force of gravity F and another force $-F_1$ perpendicular to the inclined
plane. Thus this argument works in that case also, and the kinetic energy $\tfrac12 mv^2$
at the bottom must again be mgh. So the speed of the block when it reaches the
bottom must be the same as if it fell straight down, which agrees with our
calculations, since the block's acceleration along the inclined plane is only $\sin\alpha$
of its falling acceleration, but it has $1/\sin\alpha$ as far to go. In the same way, instead
of a pendulum bob hanging from a string, we could just as well allow the bob
to slide along a plane with a circular profile, or indeed any profile, if we could
really provide a frictionless surface.

Aside from its obvious physical interest, conservation of energy is important
mathematically as a “first integral” of the laws of motion, i.e., an equation
involving only first derivatives, rather than the second derivatives that appear
in Newton's laws, all harking back to our original trick on page 87. In the
next chapter, we will derive the result of Problem 2-6 in a more systematic way,
starting from conservation of energy, with the sign of the total energy E of an
orbit turning out to have a simple geometric significance.




Conservation of energy in collisions. While the role of kinetic energy with
respect to conservative forces seems fairly straightforward, there was initially
considerable confusion about kinetic energy because of the completely different
role that it plays in that simplest, yet most essential, physical phenomenon, the
collision of two bodies.

Consider two particles, $c_1$ and $c_2$, with masses $m_1$ and $m_2$, moving along a
straight line with velocities $v_1$ and $v_2$; as usual, since our motion is confined
to a straight line, we can represent the velocities simply by numbers. It seems
natural to ask the question: if they collide, what are their new velocities $w_1$
and $w_2$ after the collision?

Conservation of momentum gives us only one equation,

    (1)   $m_1w_1 + m_2w_2 = m_1v_1 + m_2v_2,$

for the two unknowns $w_1$ and $w_2$, so it obviously can't determine an answer to
the question, even under special circumstances, like the case where $m_1 = m_2$
and $v_2 = 0$, so that we have a moving object colliding with a stationary one
of the same mass. One possible solution would be $w_1 = 0$ and $w_2 = v_1$, so
that the first body stops and imparts all its motion to the second (something
close to this happens when two steel balls collide). On the other hand, the
second body might be “soft”, like clay, so that it yields on impact, losing its
shape and adhering to the first body, with the two then moving together as one
(alternatively, we might consider bodies that will stick after contact because of
glue, as on page 26, or perhaps carts with couplings that cause them to move
as one after an impact), and in this case the final velocity of the two bodies will
simply be $v_1/2$, just another possible solution of the infinitely many.
Elementary physics textbooks need to provide problems that have answers,
of course, so, in the manner of a host nonchalantly introducing a celebrity at a
party, they will often unobtrusively insert a new definition: a collision is called
“completely elastic”, if we also have conservation of kinetic energy,

    (2)   $m_1w_1^2 + m_2w_2^2 = m_1v_1^2 + m_2v_2^2$
    or
          $m_1(w_1^2 - v_1^2) = m_2(v_2^2 - w_2^2).$

Consorting with this new definition we have a contrasting one: a collision is
“completely inelastic” if $w_1 = w_2$ (the two bodies stick together).
Once we've made a definition, it's possible to pose all sorts of simple problems
about collisions that are “completely elastic”, whatever that might mean. In
general, writing (1) in the form

    $m_1(v_1 - w_1) = m_2(w_2 - v_2),$



and dividing into [the second form of] (2), which is permissible so long as we
don't have $w_1 = v_1$ (and $w_2 = v_2$), we get

    (3)   $w_1 - w_2 = -(v_1 - v_2),$

and solving (1) and (3) for the unknowns $w_1$ and $w_2$ gives

    (*)   $w_1 = \frac{m_1 - m_2}{m_1 + m_2}\,v_1 + \frac{2m_2}{m_1 + m_2}\,v_2,$

          $w_2 = \frac{2m_1}{m_1 + m_2}\,v_1 + \frac{m_2 - m_1}{m_1 + m_2}\,v_2;$

the other solution, $w_1 = v_1$ and $w_2 = v_2$, is discarded on physical grounds,
since it represents the particles moving through each other.

This rather unenlightening formula will appear much more natural when we
express it in terms of “center of mass coordinates” (Problem 10). In any event,
we obtain the usual obvious cases: if $m_1 = m_2$ and $v_1 = -v_2$, so that we have
two particles of equal mass approaching each other with opposite velocities, we
get $w_1 = v_2$ and $w_2 = v_1$, so the two particles rebound with the same speeds
at which they collided; if $m_1 = m_2$ and $v_2 = 0$, then $w_1 = 0$ and $w_2 = v_1$, so
the first particle comes to a stop, while the second proceeds with the velocity of
the first.
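
Formula (*) is easily put into code and checked against equations (1) and (2). The following Python sketch is only an illustration; the function name `elastic_collision` and the sample values are assumptions.

```python
def elastic_collision(m1, m2, v1, v2):
    """Final velocities (w1, w2) of a completely elastic collision on a line.

    Implements formula (*): the solution of conservation of momentum (1)
    together with w1 - w2 = -(v1 - v2), equation (3).
    """
    total = m1 + m2
    w1 = (m1 - m2) / total * v1 + 2 * m2 / total * v2
    w2 = 2 * m1 / total * v1 + (m2 - m1) / total * v2
    return w1, w2

def momentum(m1, m2, a, b):
    return m1 * a + m2 * b

def twice_kinetic_energy(m1, m2, a, b):
    """The quantity m1*a**2 + m2*b**2 appearing in equation (2)."""
    return m1 * a ** 2 + m2 * b ** 2
```

With equal masses, `elastic_collision(1.0, 1.0, 3.0, -3.0)` exchanges the velocities, and `elastic_collision(2.0, 2.0, 5.0, 0.0)` stops the first particle and passes its velocity to the second, as in the two obvious cases described above. For unequal masses, `momentum` and `twice_kinetic_energy` should take the same values before and after the collision, up to rounding.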

For general collisions, a coefficient of restitution e is sometimes defined by

    $(w_1 - w_2) = -e\,(v_1 - v_2)$   or   $e = -\frac{w_1 - w_2}{v_1 - v_2},$

which experimentally seems to be (somewhat) independent of the initial velocities
$v_1$ and $v_2$. This is usually applied only when the two objects are moving
towards each other and rebound in opposite directions after the collision, so
that if we have, for example, $v_1 > 0$ and $v_2 < 0$, then also $w_1 < 0$ and $w_2 > 0$,
which means that e > 0.
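
Given a value of e, the defining relation together with conservation of momentum (1) determines the final velocities. The Python sketch below solves these two linear equations; the function names and sample values are assumptions made for illustration. The formula for the kinetic energy lost, $\tfrac12 \mu (1 - e^2)(v_1 - v_2)^2$ with $\mu = m_1m_2/(m_1 + m_2)$, follows from the same two equations by a short calculation and is not stated in the text.

```python
def collide(m1, m2, v1, v2, e):
    """Final velocities (w1, w2) from momentum conservation and restitution e.

    Solves m1*w1 + m2*w2 = m1*v1 + m2*v2 together with
    w1 - w2 = -e * (v1 - v2).
    """
    total = m1 + m2
    u = (m1 * v1 + m2 * v2) / total       # common velocity when e = 0
    w1 = u - e * m2 * (v1 - v2) / total
    w2 = u + e * m1 * (v1 - v2) / total
    return w1, w2

def kinetic_energy(m1, m2, a, b):
    return 0.5 * m1 * a ** 2 + 0.5 * m2 * b ** 2

def kinetic_energy_lost(m1, m2, v1, v2, e):
    """Kinetic energy before the collision minus kinetic energy after it."""
    w1, w2 = collide(m1, m2, v1, v2, e)
    return kinetic_energy(m1, m2, v1, v2) - kinetic_energy(m1, m2, w1, w2)
```

Taking e = 1 recovers the completely elastic case, with no kinetic energy lost, and e = 0 gives the completely inelastic case, where both bodies leave with the common velocity `u`.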

But having a definition of the coefficient of restitution hardly tells us anything;
it simply gives us a way of specifying how far experimental results differ from
the theoretical ones that we obtain from our ad hoc definition of completely
elastic collisions. We would like to understand why the modern definition of a
completely elastic collision amounts to an idealization of the concept that lurks
in the back of our minds when we think of an “elastic” body as one that pops
back into shape after being squashed in a collision.

To simplify things, let’s start by considering collisions of one body with a wall, 
whose mass may be regarded as so large that we don’t have to worry about its 




motion. First we take some nice modeling clay, form it into a ball, and hurl it 
at the wall, where it sticks in some deformed shape. This is clearly an example
of a “completely inelastic” collision. Then we throw a rubber ball at the wall.
The rubber ball is also squashed when it hits the wall, but, unlike the clay, the 





compressed rubber ball restores itself to its old shape, and bounces back in the 
reverse direction. Of course, it never bounces back with quite the same speed, 
but the term “completely elastic” was meant to describe an idealization of this 
situation, where the ball ends up pushing itself back with the same amount of 
force that caused the compression in the first place, so that it bounces back 
with the same speed. In this case, of course, we do have conservation of kinetic 
energy. 

The general case of a “completely elastic” collision of two bodies, with velocities
$v_1$ and $v_2$ along a straight line, can be treated in a similar way. In this
case, both bodies are deformed, and this deformation will continue until the
two bodies have the same velocity $u$, which, by conservation of momentum,
must be

$$u = \frac{m_1 v_1 + m_2 v_2}{m_1 + m_2}. \tag{1}$$

During the compression, the first body’s velocity will change from $v_1$ to $u$, so
the compression will involve changing the velocity by the (signed) amount $u - v_1$.
Consequently, when it decompresses, its velocity is then changed again from $u$ by the
same amount $u - v_1$, and of course the same reasoning applies to the second body.
So the final velocities $w_1$ and $w_2$ are given by

$$\begin{aligned} w_1 &= u + (u - v_1) = 2u - v_1, \\ w_2 &= 2u - v_2. \end{aligned} \tag{2}$$




Using the value of $u$ from (1), we easily find that
$m_1 w_1^2 + m_2 w_2^2 = m_1 v_1^2 + m_2 v_2^2$;
in fact, substituting (1) into (2) gives exactly the equations (*) on page 95
that we obtained by assuming conservation of kinetic energy.
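
The claim that the compression picture conserves kinetic energy is easy to confirm on a sample case. The following is a minimal sketch with hypothetical function names and arbitrarily chosen masses and velocities; exact fractions are used so that the comparisons are equalities rather than approximations.

```python
from fractions import Fraction

def via_compression(m1, v1, m2, v2):
    """Final velocities from the compression picture, equations (1) and (2)."""
    u = (m1 * v1 + m2 * v2) / (m1 + m2)   # common velocity at full compression
    return 2 * u - v1, 2 * u - v2

def twice_kinetic_energy(m1, v1, m2, v2):
    """m1 v1^2 + m2 v2^2, that is, twice the total kinetic energy."""
    return m1 * v1**2 + m2 * v2**2

m1, v1, m2, v2 = Fraction(3), Fraction(4), Fraction(5), Fraction(-2)
w1, w2 = via_compression(m1, v1, m2, v2)
print(w1, w2)
# Kinetic energy and momentum before and after the collision:
print(twice_kinetic_energy(m1, v1, m2, v2) == twice_kinetic_energy(m1, w1, m2, w2))
print(m1 * v1 + m2 * v2 == m1 * w1 + m2 * w2)
```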

This entire discussion has been limited to “head-on” collisions, but Prob- 
lems 12 and 13 have some information about the more general case. 


Conservation of energy in general. Although “collisions” between atomic particles
may be completely elastic, this is virtually never the case for everyday
collisions between objects, where we can only hope to come fairly close to complete
elasticity with objects like steel or ivory balls, and this is but one example
where conservation of kinetic energy fails in general. A completely different
example is illustrated by a rocket. Suppose that it is initially at rest, so that the
initial kinetic energy is 0 (we assume that the rocket is in space away from any
gravitational fields, so that there is no external force on the rocket, and thus no
potential energy to consider). Once the rocket has expelled some fuel, so that it
and the fuel are both moving, in opposite directions, the kinetic energy clearly
isn’t 0, since the kinetic energies are non-negative numbers, and therefore can’t
cancel out like momentum.

A similar phenomenon occurs when one stationary person shoots another 
with a rifle—the resulting motion of the bullet upsets conservation of kinetic 
energy (as well as the person being shot at). Since momentum is always 
conserved, the shooter experiences the recoil of the rifle, which is the nega- 
tive of the momentum that the bullet obtains, and which will be transferred 
to the target, hopefully outfitted with a “bullet-proof” vest—the violent effects 
obtained without such protection are only indirectly a measure of the bullet’s 
momentum, depending more on the fact that it is delivered to such a small area.

Of course, nowadays we would say that the loss of kinetic energy involved 
in collisions is due to its dissipation as heat, that the increase in kinetic energy 
of the rocket is due to the conversion of chemical energy in the fuel, and that 
the increase in kinetic energy of the bullet is similarly due to the conversion of
chemical energy in the gunpowder. In all cases, the total energy (when we add
up the heat energy and the chemical energy, and all the other types of energy
which go into modern physics) is supposed to remain constant.

To quote from Feynman [1], the law of conservation of energy 


... states that there is a certain quantity, which we call energy, that
does not change in the manifold changes which nature undergoes. ...
It is not a description of a mechanism, or anything concrete; it is just
a strange fact that we can calculate some number and when we finish 
watching nature go through her tricks and calculate the number again, 
it is the same. 




Feynman goes on to discuss this by means of an analogy which is both very 
instructive and very entertaining, but much too long to quote here, so you should 
go read, or re-read, it yourself. In fact, chapters 4 through 13 of Feynman [1]
may be regarded as a continuing exposition of the role that the concept of 
energy plays in physics. 

In summary, as far as mechanics is concerned, conservation of energy
(kinetic plus potential) is an important principle for conservative forces, which
are generally the ones we wish to consider. On the other hand, for more complicated
phenomena, which involve other forms of “energy”, there are no such
conservation laws; or, to put it another way, the conservation of energy involves
factors which are basically outside the purview of mechanics itself.

In this regard, we may consider once again “completely inelastic” collisions.
In addition to the case of a ball of clay hurled at a wall, or even at a more mobile
object like a steel cube, we can also consider a completely inelastic collision
between two rigid bodies that stick together because of couplings, or perhaps
glue on opposing surfaces. It is easy (Problem 15) to compute the energy lost in
the collision; the result was first obtained by General Lazarus Carnot, the father
of Sadi Carnot of thermodynamics fame, and has been dubbed the “Carnot
energy loss” in Sommerfeld [2], mentioned on page 35.

This energy loss presumably shows up in shock waves coursing through the 
(not really rigid) bodies, dissipating as heat and sound waves, and possibly in 
some sort of chemical reaction involving the glue. In a way, this is the exact 
opposite of the rocket, where a large kinetic energy evolves from none at all—in 
that case, totally because of a chemical reaction. 

A Carnot energy loss that we might judge to be fairly large produces only a
small change in temperature, which might be difficult to observe experimentally.
For a simple calculation, consider two iron cubes each with a mass of 1 kilogram,
which smash together after moving toward each other, each with speeds of
1 meter/second. The energy loss would then be

$$2 \cdot \tfrac12\,(1\ \text{kg})\,(1\ \text{m/s})^2 = 1\ \frac{\text{kg}\,\text{m}^2}{\text{s}^2} = 1\ \text{Joule},$$

by definition of the Joule. The specific heat of iron is

$$.45\ \frac{\text{Joule}}{\text{g}\,{}^\circ\text{C}},$$

where °C denotes degrees centigrade, so .45 Joule raises the temperature of
1 gram of iron by 1°C, and our two iron cubes, of total mass 2 kilograms,
would have their temperature raised by 1/(.45 × 2000) ≈ .0011 degrees centigrade. If instead
of a speed of 1 m/s, which is only 3.6 km/hr, we chose speeds 30 times as
fast, namely 108 km/hr, or roughly 67 miles/hr, then the temperature increase
would be 900 times as great, or roughly 1 degree centigrade; of course, this result
holds just as well for two iron cubes of arbitrary masses that end up moving
as one.
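
The arithmetic of this estimate can be laid out as a short computation. This is only a sketch under stated assumptions: the function names are hypothetical, the specific heat is read as .45 Joule per gram per degree centigrade (so that .45 Joule warms one gram by one degree), and all of the lost kinetic energy is assumed to heat the two cubes uniformly.

```python
SPECIFIC_HEAT_IRON = 0.45  # joules per gram per degree centigrade (value quoted in the text)

def carnot_energy_loss(m1, v1, m2, v2):
    """Kinetic energy (joules) lost when two bodies collide head-on and
    stick together; masses in kilograms, velocities in meters/second."""
    u = (m1 * v1 + m2 * v2) / (m1 + m2)            # common final velocity
    before = 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2
    after = 0.5 * (m1 + m2) * u**2
    return before - after

def temperature_rise(m1, v1, m2, v2, c=SPECIFIC_HEAT_IRON):
    """Temperature increase (degrees centigrade), assuming all of the lost
    energy ends up heating the two bodies uniformly."""
    grams = 1000.0 * (m1 + m2)
    return carnot_energy_loss(m1, v1, m2, v2) / (c * grams)

print(temperature_rise(1.0, 1.0, 1.0, -1.0))     # two 1 kg cubes at 1 m/s each
print(temperature_rise(1.0, 30.0, 1.0, -30.0))   # the same cubes at 30 m/s each
```

The energy loss is computed here directly as kinetic energy before minus kinetic energy after, rather than from a closed formula.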

In mechanics problems we naturally do not expect to determine exactly how 
such energy losses occur in order to understand the underlying mechanical 
principles. But this circumstance provides a convenient rug under which all 
sorts of mysterious energy losses can be swept; a classic example is discussed in 


the following Addendum. 




ADDENDUM 3A 


WHIPS AND CHAINS 
(Why Easy Physics is So Hard: II) 


The progenitor of all those horrible “variable mass” problems introduced in
Addendum 1A, which have been used to torment generations of physics students
ever since, was a paper by a mathematician, Cayley [1], that begins: “There
are a class of dynamical problems which, so far as I am aware, have not been
considered in a general manner. The problems referred to ... are those in
which the system is continually taking into connexion with itself particles of
infinitesimal mass ... For instance, a problem of the sort arises when a portion
of a heavy chain hangs over the edge of a table, the remainder of the chain
being coiled or heaped up close to the edge of the table ... ” (presumably an
idealized case of a fine chain with very small links, as in Problem 1-13; notice
that in Problem 1-13 the entire chain is always being pulled, so Cayley’s problem
is quite different).

We can apply the equations (**) and (**’) on page 35, where our variable 
mass is the part of the chain hanging over the table, with the additional links 
added as the chain falls; since these additional links are initially at rest, the 
velocity at which they are added, relative to the falling chain, is —v, so this is 
again a case where v + q = 0, and our equation is simply 


F(t) = (mv)’(t). 


Taking the uniform density of the chain to be 1 for convenience, if $x(t)$ is the
length of chain hanging over the table at time $t$, then $m(t) = x(t)$, while $F(t)$
has magnitude $g\,x(t)$. Thus our equation becomes simply

$$(x x')' = g x.$$

If we set $y = x x'$, so that

$$\frac{dy}{dt} = g x,$$

and then use the good old-fashioned trick of writing

$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt},$$

we obtain

$$g x = \frac{dy}{dx}\,\frac{dx}{dt} = \frac{dy}{dx}\,x',$$

so that

$$g x^2 = \frac{dy}{dx}\,x x' = y\,\frac{dy}{dx},$$

or

$$g x^2\,dx = y\,dy.$$

Thus we have

$$g\,\frac{x^3}{3} + A = \frac{y^2}{2},$$

and if we assume the initial condition $x'(0) = 0$, we get $A = -g a^3/3$, where $a$
is the initial length hanging over the table. This leads to

$$(x')^2 = \frac{2g}{3}\left(x - \frac{a^3}{x^2}\right). \tag{1}$$


Cayley considered only the special case $a = 0$ (which means that the whole
chain is initially on the table, so that it shouldn’t fall at all, but presumably he
assumed that the result would be a good approximation to the case where $a$ is
small). Then (1) becomes $x' = \sqrt{2g/3}\,\sqrt{x}$, and integrating $x'/\sqrt{x} = \sqrt{2g/3}$
yields

$$2\sqrt{x(t)} = \sqrt{2g/3}\cdot t, \quad\text{or}\quad x(t) = (1/6)\,g t^2.$$

Consequently, the chain is always falling with 1/3 the acceleration of a freely
falling chain; of course, that applies only until the chain has cleared the table,
after which its acceleration must simply be $g$.

At first sight this seems rather weird and unlikely, and we might suspect that
it is an artifact of the strange choice $a = 0$. Cayley may have taken this case
because he didn’t want to deal with the general equation (1), which would require
an elliptic integral. But, without solving explicitly, we can still compute
from (1) that

$$2x'x'' = \frac{2g}{3}\left(x' + \frac{2a^3 x'}{x^3}\right),$$

so that

$$x'' = \frac{g}{3}\left(1 + \frac{2a^3}{x^3}\right).$$

At the beginning ($x = a$) this has the value $g$, and it then decreases, approaching
$g/3$ for long chains, with the same sort of discontinuity as before. (The equation
for $x''$ follows from the previous equation when $x' \ne 0$ (i.e., $x \ne a$), and then
for $x = a$ by continuity, although technically we must appeal to an elementary
calculus theorem.¹)


¹ If $f$ is continuous at $a$ and $\lim_{x\to a} f'(x)$ exists, then $f'(a)$ exists and equals $\lim_{x\to a} f'(x)$.
See, e.g., Spivak [1], Theorem 11-7.




To make sense of this perplexing answer, which of course is only approximate
for the case of an actual chain, it helps to note that each time another link is 
added to the falling chain, that link is suddenly yanked from velocity 0 to the 
velocity of the falling chain, and the resultant increase in momentum must be 
balanced by a decrease in momentum of the falling chain. As the falling chain 
gets longer and more massive, one might expect the effect to be less noticeable, 
but the longer falling chain also has a much greater velocity, so the momentum 
added to the next link also increases greatly. 
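
The behavior of the Cayley chain can also be explored numerically, without any elliptic integrals. The sketch below integrates the equation of motion $(xx')' = gx$ in the variables $x$ and $y = xx'$ with a classical fourth-order Runge-Kutta step, and then compares the result with equation (1) and with the acceleration formula obtained from it. The function name, the step size, and the sample values of $a$ and $g$ are all assumptions made for illustration.

```python
def cayley_chain(a, g=9.8, dt=1e-4, t_end=1.0):
    """Integrate (x x')' = g x with x(0) = a > 0, x'(0) = 0 by the classical
    fourth-order Runge-Kutta method, using the variables x and y = x x'.
    Returns the hanging length x and the speed x' at time t_end."""
    def rhs(x, y):
        return y / x, g * x            # x' = y / x,  y' = g x

    x, y = a, 0.0
    for _ in range(round(t_end / dt)):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return x, y / x

g, a = 9.8, 0.1
x, v = cayley_chain(a, g)
# The two sides of (1):
print(v**2, (2 * g / 3) * (x - a**3 / x**2))
# The acceleration, from the equation of motion and from the closed formula:
print(g - v**2 / x, (g / 3) * (1 + 2 * a**3 / x**3))
```

The first expression for the acceleration comes from expanding $(xx')' = (x')^2 + xx''$ in the equation of motion.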

This problem has appeared in many standard mechanics books—usually with 
Cayley’s solution, though Sommerfeld [2] hints at the more general solution— 
not so much for its own sake, but in order to examine the question of conser- 
vation of energy. 

When a piece of chain of length $x$ is hanging over the table, the potential
energy has decreased by $\int_a^x g u\,du = \tfrac12 g(x^2 - a^2)$, while the kinetic energy has
increased by $\tfrac12 x(x')^2$, so that the total change of energy is

$$\Delta E = \tfrac12 x(x')^2 - \tfrac12 g(x^2 - a^2).$$

By (1) this is

$$-\tfrac16\,g x^2 + \tfrac12\,g a^2 - \tfrac13\,g\,\frac{a^3}{x},$$

so for all $x > a$ it is negative, and the loss increases rapidly as $x$ gets large. It is hardly
surprising that conservation of energy does not hold for this solution, since, as
Problem 23 shows, it does hold for the solution to Problem 1-13.

This loss of energy is explained in terms of the Carnot energy losses each
time the dangling chain pulls another link off the table, the point being that this 
is a completely inelastic collision, since the resulting velocities of the two bodies 
are the same; the energy loss presumably ends up heating the chain. 
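
The size of this energy loss is easy to tabulate. The following sketch evaluates the change of energy in two ways, once from its definition with $(x')^2$ taken from equation (1), and once from the closed form $-\tfrac16 gx^2 + \tfrac12 ga^2 - \tfrac13 ga^3/x$ that results; the function names and the sample values are hypothetical choices for illustration.

```python
from fractions import Fraction

def energy_change_direct(x, a, g):
    """Delta E = (1/2) x (x')^2 - (1/2) g (x^2 - a^2), with (x')^2 from (1)."""
    speed_squared = 2 * g / 3 * (x - a**3 / x**2)
    return x * speed_squared / 2 - g * (x**2 - a**2) / 2

def energy_change_closed(x, a, g):
    """The closed form -(1/6) g x^2 + (1/2) g a^2 - (1/3) g a^3 / x."""
    return -g * x**2 / 6 + g * a**2 / 2 - g * a**3 / (3 * x)

g, a = Fraction(98, 10), Fraction(1)
for x in (Fraction(1), Fraction(2), Fraction(5), Fraction(10)):
    print(x, energy_change_direct(x, a, g), energy_change_closed(x, a, g))
```

Exact fractions are used so that the agreement of the two columns is an equality, not an approximation.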

It actually turns out to be rather difficult to conduct experiments to check our
answer, because one can’t get a real chain “coiled or heaped up close to the edge
of the table” in such a way that each link is right at the table edge, ready to be
added to the falling part; in practice, there is an unpredictable jumble as individual
links are released. One way to simulate the conditions of the problem might
be to lay the additional links along a table made of slats, removing each leftmost
slat once the link lying on it has been pulled off. (An inexact, but instructive
approximation could be provided by a carnival ride, the “whip”, where individual
cars are lined up close to the edge of a slide, but attached with bunched-up
50’ chains. After the first car is allowed to start sliding, the experience of its
riders is quite different from that of riders in the last car!)
There is another classical problem for which experiments can more easily 
be carried out. Consider a folded chain, initially hanging by supports at both 




ends, and then released at one end. After the chain has finished falling, its total 
energy (all in the form of potential energy) is much smaller. 

The classical description of this situation is that each of the links on the free 
end of the chain just falls with acceleration g until being jerked to a stop, with 
the loss of energy being accounted for by the corresponding Carnot energy 
losses. But a simple experiment shows that the acceleration must be consider- 
ably greater than g. It involves only a moderately heavy chain, a single link 
from such a chain, and two pieces of window glass. The specific chain used in 
one experiment was 5 feet long (≈ 152 cm), with 50 links, and weighed about
14.3 oz (≈ 405 gm). The thickness of the glass was 3/16 inches (≈ .5 cm).




Opposite ends of one piece of glass were placed on rests of the same height 
(two copies of a book), and the single link was repeatedly dropped onto it from 
a height greater than 5 feet, with no apparent ill effect (sometimes the link was 
initially held horizontally, sometimes vertically). The piece of glass was replaced
with the second, fresh, piece, and the 5 foot long chain was secured so that it
hung with only the last link touching the glass. The free end was then raised to
the same height as the secured end (a short distance away from it horizontally, 
so that the chain wouldn’t become entangled in itself as it fell) and released. 
The result was a dramatic shattering of the glass plate. 

We can analyze the fall of the chain in this problem in the same way as 
the Cayley problem, taking as our body with variable mass the falling part of 
the chain, which is “losing” links to the fixed part. Since these links become 
stationary as they join the fixed part of the chain, we again have $q = -v$, so we
still have the case where v + q = 0, leading to the same equation 


F(t) = (mv)’(t). 


As before, we assume that the chain has uniform density 1. It will also be
convenient to assume that the fully extended chain is hanging so that it just
touches the ground, and then let $x$ be the height of the free end of the chain,
initially having the value $x_0$ (thus, $x_0 = L$ for a folded chain of length $L$ with
both ends initially at the same height). The falling part of the chain has length
$x/2$, so our equation becomes

$$-g\,\frac{x}{2} = \left(\frac{x}{2}\,x'\right)',$$

or simply $g x = -(x x')'$. Setting $y = x x'$ we now have

$$-g x = \frac{dy}{dt} = \frac{dy}{dx}\,x',$$

so that

$$-g x^2 = \frac{dy}{dx}\,x x' = y\,\frac{dy}{dx},$$

and hence

$$g\,\frac{x^3}{3} + A = -\frac{y^2}{2}.$$

At $t = 0$ we have $x = x_0$ and $x' = 0$, so we get $A = -g x_0^3/3$, leading to

$$\begin{aligned} (x')^2 &= \frac{2}{3}\,g\left(\frac{x_0^3}{x^2} - x\right) \\ x'' &= -\frac{g}{3}\left(\frac{2x_0^3}{x^3} + 1\right), \end{aligned} \tag{A}$$

the second equation following by differentiation of the first, as before. Thus the
downward acceleration starts at $g$ and then increases, so the released end of the
chain falls faster than a freely falling chain.
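
To get a feeling for how much faster, one can tabulate the speed and acceleration given by equations (A), namely $(x')^2 = \tfrac23 g(x_0^3/x^2 - x)$ and $x'' = -\tfrac{g}{3}(2x_0^3/x^3 + 1)$, against the speed of a body simply dropped from the height $x_0$. This is a minimal sketch; the function names, the value $g = 9.8$, and the choice $x_0 = 1$ are assumptions made for illustration.

```python
import math

def speed_A(x, x0, g=9.8):
    """Downward speed |x'| of the free end at height x, from equations (A)."""
    return math.sqrt(2 * g / 3 * (x0**3 / x**2 - x))

def accel_A(x, x0, g=9.8):
    """Downward acceleration -x'' of the free end at height x, from (A)."""
    return g / 3 * (2 * x0**3 / x**3 + 1)

def speed_free_fall(x, x0, g=9.8):
    """Speed of a body dropped from rest at height x0 when it reaches height x."""
    return math.sqrt(2 * g * (x0 - x))

x0 = 1.0
for x in (1.0, 0.9, 0.5, 0.1):
    # height, chain speed, free-fall speed, chain acceleration in units of g
    print(x, speed_A(x, x0), speed_free_fall(x, x0), accel_A(x, x0) / 9.8)
```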

This increase in acceleration can be explained by considering a link L of the
chain that has just reached the bottom, as in (a). This link has acquired a large
velocity, but is now going to be stopped dead in its descent by the part of the
chain on the fixed end, and all that momentum will be used to yank the link
around by 180°, as shown magnified in (b) and (c) of the figure. This yanking
is going to pull the falling part of the chain even faster. This is basically just the
opposite of what happens for the Cayley problem, where the falling chain yanks
the next link off the table, resulting in the falling chain having its acceleration
reduced; in the current situation the link that was falling, but now becomes part
of the fixed chain, yanks the falling chain, resulting in the falling chain having
its acceleration increased.

[Figure: (a) the folded chain, with its fixed end and falling end marked; (b), (c) magnified views of the link being pulled around the bottom of the fold.]


The fact that the falling chain has acceleration greater than $g$ seems to have
been first observed by Calkin and March [1]. To explain the results of their experiments
(rather more sophisticated than the shattered glass experiment), they
completely jettisoned the question of Carnot energy losses, and simply assumed
that conservation of energy holds, obtaining the equations (Problem 24)

$$\begin{aligned} (x')^2 &= g\left(\frac{x_0^2}{x} - x\right) \\ x'' &= -\frac{g}{2}\left(\frac{x_0^2}{x^2} + 1\right), \end{aligned} \tag{B}$$

which seem to agree quite well with their experimental results. (In the case of
this solution, the increase in velocity is easily explained by the fact that the same
amount of energy has to be concentrated in shorter and shorter pieces of chain,
so that the velocities must increase.)
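
The two candidate solutions for the folded chain can be compared numerically. The sketch below evaluates the downward speed from the variable-mass solution, $(x')^2 = \tfrac23 g(x_0^3/x^2 - x)$, and from the energy-conserving solution, $(x')^2 = g(x_0^2/x - x)$, at a few heights; the function names, $g = 9.8$, and $x_0 = 1$ are assumptions made for illustration.

```python
import math

def speed_A(x, x0, g=9.8):
    """Downward speed from equations (A), the variable-mass solution."""
    return math.sqrt(2 * g / 3 * (x0**3 / x**2 - x))

def speed_B(x, x0, g=9.8):
    """Downward speed from equations (B), the energy-conserving solution."""
    return math.sqrt(g * (x0**2 / x - x))

x0 = 1.0
for x in (0.9, 0.7, 0.5, 0.3, 0.1):
    print(f"x = {x:.1f}   (A): {speed_A(x, x0):7.3f}   (B): {speed_B(x, x0):7.3f}")
```

Subtracting the two expressions for $(x')^2$ gives $g\,(x_0 - x)^2(2x_0 + x)/(3x^2)$, which is never negative, so with these formulas the speed from (A) is never smaller than the speed from (B), and the two differ most near the end of the fall.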

In the figure below, comparing the downward speeds for equations (A) and (B),
the direction of the $x$-axis is reversed, so that $x = x_0$, at time $t = 0$, appears
on the left, while $x = 0$, at the end of the fall, appears on the right. I do not
know how well the Calkin-March data would match up with equations (A) [for
an actual chain, of only finitely many links, either set of equations becomes less
reliable near the end of the fall, which is where the solutions diverge the most],
or how to choose between them, or whether the solution for a real chain is some
sort of compromise between the two. Or, for that matter, how one should treat
the same problem when the chain is replaced by a rope.

[Figure: downward speeds for equations (A) and (B), plotted from $x_0$ ($t = 0$) on the left to 0 (end of fall) on the right.]

Note that, with either solution, at the end of the fall the speed and acceleration
actually become infinite, or at any rate very large for an actual chain of only
finitely many links, where the friction between links also takes its toll. This
possibly counter-intuitive behavior is also demonstrated by the crack of a whip;
here the force applied to the whip takes the place of gravity, and the crack of
the whip is a shock wave caused by the very large velocity with which the end
of the whip is traveling.


A more recent paper examining these questions, Wong and Yasui [1], ap- 
proves of the Calkin-March solution, dismisses Cayley’s solution of his problem, 
and by extension the one given here, in favor of a conservation of energy so- 
lution (Problem 25), and goes on to discuss the folded chain problem in great 
detail. This paper contains an extensive bibliography of previous solutions to 
both the folded chain problem and the Cayley problem, which may be very 
instructive to peruse. But it appears to me that all the conclusions of the paper 
itself are wrong.


Undoubtedly others will find that all the conclusions in this Addendum are 
wrong.


I must admit to being totally confused. I thought mechanics
was a cookbook subject where one uses a few basic principles
to translate physics into mathematics, and then revs up the
calculus machine and grinds out the answer. I guess your book
is intended to cure those of us who have this misapprehension.


—An eminent mathematician 




ADDENDUM 3B 


FOLLOW THE BOUNCING BALL
(Why Easy Physics is So Hard: III)


I was fortunate enough to be a graduate student at Princeton when the math- 
ematics department was still ensconced in the original Fine Hall, whose com- 
forting common room, with its antique appointments and fireplace bearing 
Einstein’s famous words “Raffiniert ist der Herr Gott, aber boshaft ist Er nicht” 
(God is subtle, but not malicious), was often equally a meeting place for physi- 
cists from the adjoining building. 

At one point this led to a lively discussion of a demonstration often included in
mechanics courses, involving a series of balls suspended from cords so that they
just touch. If one ball is raised on the left, and allowed to strike the remaining
balls, the right-most ball is observed to swing out to the same height, while if two
balls are raised together on the left, the right-most two balls swing out together
on the right. And if two balls are raised on the left while one is raised on the
right, the final result seems to “interchange” the balls, with one ball flying off
on the left and two balls flying off together on the right.

The apparatus for such striking and amusing experiments has apparently
been dubbed “Newton’s cradle”, a miniature version of which is often sold as a
device to relieve an executive’s boredom, as well as an “educational toy”, since
it supposedly illustrates the laws of conservation of momentum and energy. 

Of course, the results are merely consistent with the laws of conservation of
momentum and energy. Those two laws alone cannot possibly predict the results
once there are more than two balls, since they give only two equations for three
or more unknowns. There are, of course, a few other mathematical conditions,
expressing the fact that one ball can’t pass through another, but not enough
to make the solution unique. In fact, to get another solution one need only
consider, for example, the case where one ball collides with two balls that simply
behave as a single ball of twice the mass.
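
The non-uniqueness is easy to exhibit concretely. The sketch below takes three balls of equal mass, with one striking two at rest, and checks two different outcomes against both conservation laws and against the condition that no ball has to pass through its neighbor. The function name `consistent` and the unit masses and velocities are hypothetical choices for illustration.

```python
from fractions import Fraction

def consistent(masses, before, after):
    """True if `after` conserves momentum and kinetic energy relative to
    `before`, and no ball ends up moving faster than the ball to its right
    (so that no ball has to pass through another)."""
    momentum = lambda vs: sum(m * v for m, v in zip(masses, vs))
    energy = lambda vs: sum(m * v * v for m, v in zip(masses, vs))  # twice the kinetic energy
    ordered = all(left <= right for left, right in zip(after, after[1:]))
    return (momentum(before) == momentum(after)
            and energy(before) == energy(after)
            and ordered)

masses = [Fraction(1)] * 3
before = [Fraction(1), Fraction(0), Fraction(0)]            # one ball strikes two at rest

observed = [Fraction(0), Fraction(0), Fraction(1)]          # the usual outcome
lumped = [Fraction(-1, 3), Fraction(2, 3), Fraction(2, 3)]  # two balls acting as one of twice the mass

print(consistent(masses, before, observed), consistent(masses, before, lumped))
```

Velocities are listed from the left-most ball to the right-most, with motion to the right counted as positive.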
Similarly one can ask what would happen when a small ball strikes a large ball
that touches another small ball on the other side. There were those who claimed
that the results would be exactly the same as if the large ball were replaced by 
a series of small balls: the incoming small ball would come to a stop, while the 
other small ball would fly off with the same velocity. It was pointed out that this 
would be rather strange, since one wouldn’t expect that throwing a ball at the 
earth would cause it to stop dead while a ball at the antipodes would suddenly 
shoot up in the opposite direction! But that objection was rather disparaged, 
supposedly explained away by the fact that the earth and the balls were far from 
completely elastic. 

At this point, it was helpfully pointed out that the physics department kept 
a handy room full of experimental equipment, including ivory balls suspended 
from strings. We enthusiastically decided to do an experiment, much to the 
disdain of several physicists, who considered this a complete waste of time, as 
they had already told us what the result would be. To our delight, we even
found that the equipment included one large ivory ball. Although one passing 
physics professor warned us that we needed to know the speed of sound in 
ivory, we stubbornly proceeded merely to do the experiment and observe the 
results. Of course, as expected, the first ball bounces back quite a bit, rather 
than coming to a stop, allowing us to return to the common room triumphantly 
echoing Galileo’s rebellious words “But it does move!” 

Ever since that time, I have regarded explanations given in mechanics texts 
with a healthy skepticism—which has helped shape much of the material in 
Chapters 5 and 6. 




Mathematicians like to use the “epsilon-method” for solving this problem,
considering what happens when the balls are separated by a small distance $\varepsilon$,
and letting $\varepsilon \to 0$, but there’s no physical justification for assuming that this will
give the right result for balls starting in contact. In fact, since material spheres
aren’t perfectly spherical, and positions aren’t precise, it’s not even clear what
“in contact” really means; in practice, the balls will actually be pressing against
each other.

Some models of Newton’s cradle do leave a slight gap between the balls, presumably
because this gives better results. In fact, careful experiments when
the balls are in contact, Chapman [1], showed discrepancies from the theoretical
predictions larger than can be attributed to the slight deviation from
complete elasticity, and led to several other investigations, e.g., Herrmann and
Schmälzle [1], Herrmann and Seitz [1], and Auerbach [1].

This is the sort of problem that cannot be solved by a simple application of 
Newton’s laws, treating each ball as a particle; we would have to consider all the 
individual molecules of the balls and their interacting forces, which would be a 
problem of impossible complexity, even if we actually knew these forces. Instead, 
we need a simplified model that gives good agreement with experiment. Not 
surprisingly, therefore, the question has received considerable attention from 
practitioners of “applied mechanics” or “mechanical engineering”, who really
need to solve such problems. See, for example, Ceanga and Hurmuzlu [1].




PROBLEMS 


1. As an undergraduate math major, the hopelessness of understanding
physics was borne down upon me every time I heard about some first-year
physics problem. One such typical problem, usually presented on the
second or third day of class, concerns a monkey, let’s call him Tantalus,
who is climbing a rope passing over a pulley with a bunch of bananas of
exactly his weight attached to the other end.
Problems of this sort always drove me bananas because I never understood
how one was supposed, on the basis of Newton’s laws for particles,
to divine the fact that the end of the rope attached to the bananas must
be exerting exactly the same upward force that the other end of the rope
exerts upwards on Tantalus (problems of this sort are considered briefly in
Chapter 6).
But what really made a monkey out of me was the next step, concluding that
Tantalus and the bananas rise at exactly the same rate. My mistake was that
I kept trying to think about the mechanism of Tantalus climbing: just what
happens when he lifts one arm to reach higher on the rope and then tugs on
the rope (does it matter if he still holds on to the rope with the other hand)?
Sometimes, thinking about physics problems just seems to make them harder.
Actually, the standard answer, that Tantalus and the bananas rise at exactly
the same rate, is false; in fact, it is meaningless. What is the correct answer?

The center of mass of Tantalus rises at the same rate as (the center of mass of) the bananas.
Of course, the center of mass of the bananas doesn’t change (unless Tantalus gets
lucky and a banana falls off), but Tantalus’ center of mass continually changes as he climbs.


2. Let $A_1, \dots, A_K$ be $K$ collections of particles, with $C_i$ the center of mass
of $A_i$. Show that the center of mass $C$ of the collection $A = A_1 \cup \cdots \cup A_K$ is
the same as the center of mass of the collection $\{C_1, \dots, C_K\}$.


3. (a) If two particles $c_1$ and $c_2$ satisfy the second law, $F = m_i v_i'$, as well as the
Momentum Law $F = \sum_i m_i v_i'$, then the internal forces between them satisfy
the third law.

(b) If they also satisfy the Angular Momentum Law $\tau = L'$, then the internal
forces satisfy the strong form of the third law.

(c) Can similar conclusions be drawn for a system of more than two particles?


4. (a) For a collection of particles $c_i$ the total angular momentum $L_P$ is independent
of $P$ if and only if $\sum_i m_i v_i = 0$.

(b) For a collection of forces $F_i$ at $c_i$, the total torsion $\tau_P$ is independent of $P$
if and only if $\sum_i F_i = 0$.







5. A “spherical pendulum” is just a pendulum that is not necessarily swinging
in a plane, which can easily happen if the pendulum bob is given a push in
some direction as it is released. The forces on the pendulum bob are the force
of gravity downwards and the force exerted by the string, as on page 47.


(a) The vertical component of $\tau$ is always 0.

(b) If the path of the pendulum bob is $c(t) = (x(t), y(t), z(t))$, then $x'y - y'x$
is constant.

(c) If the pendulum is ever perpendicular, passing over (0,0), then it is actually
swinging in some vertical plane.


6. (a) Consider a system of particles $c_1, \dots, c_K$ with total mass $M$ and center
of mass $C$. If $y_i = c_i - C$, then

$$T = \tfrac12 M|C'|^2 + \tfrac12 \sum_i m_i |y_i'|^2.$$

(b) Consider inertial systems that differ only by the choice of origin. Show that
the one having origin $C$ is the one with the smallest kinetic energy.


7. Let $v_P$ and $v_Q$ be the velocities at some time of the endpoints $P$ and $Q$ of
a uniform rigid rod of mass $m$. Show that the kinetic energy of the rod at this
time is

$$T = \frac{m}{6}\left\{|v_P|^2 + |v_Q|^2 + \langle v_P, v_Q\rangle\right\}.$$


8. Assuming that the collision of two objects doesn’t increase the total kinetic
energy, show that the coefficient of restitution $e$ satisfies $0 \le e \le 1$.


9. In Chapter 1, we mentioned Huygens’ ingenious argument (page 26), based
on the idea of examining a collision in two different coordinate systems, moving
with uniform velocity with respect to each other. Huygens actually extended
his argument in a strange and complicated way (see Dugas [1; pp. 177-180] and
Mach [1; pg. 403ff.]) that essentially assumed conservation of kinetic energy in
collisions.

Let $v_1$ and $v_2$ be the initial velocities of two bodies, of masses $m_1$ and $m_2$, and
$w_1$ and $w_2$ their velocities after a collision. Assuming conservation of kinetic
energy we then have

$$m_1\langle v_1, v_1\rangle + m_2\langle v_2, v_2\rangle = m_1\langle w_1, w_1\rangle + m_2\langle w_2, w_2\rangle. \tag{1}$$

Obtain the corresponding equation assuming that conservation of energy also
holds in an inertial system moving with velocity $u$ with respect to the original
one, and conclude that

$$m_1\langle v_1, u\rangle + m_2\langle v_2, u\rangle = m_1\langle w_1, u\rangle + m_2\langle w_2, u\rangle.$$

If this is true for all $u$, then we have

$$m_1 v_1 + m_2 v_2 = m_1 w_1 + m_2 w_2;$$

so conservation of kinetic energy in collisions implies conservation of momentum
(provided we assume that the conservation of kinetic energy holds in all inertial
systems).


10. (a) For two particles colliding along a straight line, if the total momentum
of our two particle system is zero, we have

$$m_1v_1 + m_2v_2 = 0 \implies v_2 = -(m_1/m_2)v_1,$$

and thus also

$$m_1w_1 + m_2w_2 = 0 \implies w_2 = -(m_1/m_2)w_1.$$

Show that conservation of kinetic energy leads immediately to the solutions
$w_1 = v_1$, $w_2 = v_2$ (ignored for physical reasons) and $w_1 = -v_1$, $w_2 = -v_2$, so
that after the collision the velocities are simply reversed.

(b) For a system with no external forces, the center of mass has constant velocity,
so we can choose an inertial system, the center of mass coordinates, in which
the center of mass is the origin. Show that the total momentum in this inertial
system is zero. Letting $v_i^*$ and $w_i^*$ denote the initial and final velocities in the
center of mass coordinates, compute $v_i^*$ and $w_i^*$ in terms of the $v_i$ and $w_i$ and
deduce (*) on page 95 from the interchange of speeds in the center of mass
coordinates.


11. For a particle of mass $m_1$ and velocity $v_1$, in a totally elastic head-on collision
with a stationary particle of mass $m_2$, use (*) to show that

$$\frac{w_2}{v_1} = \frac{2m_1}{m_1 + m_2}.$$

Conclude that if $m_1$ and $v_1$ are unknown, but $m_2$ and $w_2$ are known for two
different stationary particles of different masses, then $m_1$ and $v_1$ can be determined.
In 1932 Chadwick used this method for certain unknown uncharged
particles created in a nuclear reaction colliding with various nuclei of known
masses, to determine that the mass of these particles (now known as neutrons)
was practically equal to that of the proton.
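
The last step of Problem 11 can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function name `projectile_from_recoils` and its argument names are hypothetical, and the numbers used to exercise it are invented for the check (they are not Chadwick's data). It solves the two equations $w_2(m_1 + m_2) = 2m_1v_1$, one for each target, for the unknowns $m_1$ and $v_1$.

```python
def projectile_from_recoils(m2_a, w2_a, m2_b, w2_b):
    """Recover (m1, v1) from two head-on elastic collisions.

    Each stationary target of known mass m2 recoils with speed
    w2 = 2*m1*v1 / (m1 + m2).  Equating 2*m1*v1 for the two targets
    gives a linear equation for m1.
    """
    if w2_a == w2_b:
        raise ValueError("the two recoil speeds must differ")
    # w2_a*(m1 + m2_a) = w2_b*(m1 + m2_b)  =>  solve for m1
    m1 = (w2_b * m2_b - w2_a * m2_a) / (w2_a - w2_b)
    # then 2*m1*v1 = w2_a*(m1 + m2_a)
    v1 = w2_a * (m1 + m2_a) / (2.0 * m1)
    return m1, v1


# Made-up example: a projectile with m1 = 1, v1 = 3 hits targets of mass 1 and 14.
m1, v1 = projectile_from_recoils(1.0, 3.0, 14.0, 0.4)
```

With targets of very different masses the two recoil speeds differ substantially, which is what makes the linear equation for $m_1$ well conditioned.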


12. Suppose that $c_1$ and $c_2$ are bodies of the same mass, with $c_2$ initially at rest,
so that they have initial velocities $\mathbf v_1$ and $\mathbf v_2 = 0$. Show that after a perfectly
elastic collision their velocities $\mathbf w_1$ and $\mathbf w_2$ are perpendicular.


13. Now consider the general situation where two particles $c_1$ and $c_2$, moving
with velocities $\mathbf v_1$ and $\mathbf v_2$ lying in a plane, collide and end up with velocities $\mathbf w_1$
and $\mathbf w_2$. As illustrated in part (b) of the figure, this is often applied to situations
where two particles don't actually collide, but are deflected from a straight path
for other reasons. For example, two positively charged particles initially far
from each other might follow paths like this, a situation we will encounter in
Addendum 4C.

[Figure: (a) two particles $c_1$, $c_2$ colliding; (b) two particles deflected from straight paths without colliding.]


(a) The velocity of the center of mass $C$ in our original coordinate system is
$\mathbf v_C = (m_1\mathbf v_1 + m_2\mathbf v_2)/(m_1 + m_2)$, and the velocities

$$\mathbf v_1^* = \mathbf v_1 - \mathbf v_C, \qquad \mathbf v_2^* = \mathbf v_2 - \mathbf v_C$$

are negative multiples of each other, and similarly for $\mathbf w_1^*$ and $\mathbf w_2^*$.

(b) If the collision is perfectly elastic we have $v_1^* = w_1^*$ and $v_2^* = w_2^*$. Thus
the speeds are the same before and after the collision, and the velocity vectors
simply rotate by an angle $\Theta$, known as the scattering angle.





[Figure: (a) general collision; (b) perfectly elastic collision.]


14. Many experiments involve a moving particle colliding with one that is initially
at rest in the laboratory. Addendum 4C describes one such experiment,
where the mass of the moving particle is very small compared to that of the
stationary particle, so that the center of mass coordinates are practically the same
as the “laboratory coordinates”. In general, however, the observed “laboratory
scattering angle” differs significantly from the scattering angle in the center of
mass coordinates.


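(a) If the second particle is stationary, $\mathbf v_2 = 0$, then the velocity $\mathbf v_C$ of the center
of mass is parallel to $\mathbf v_1$ with magnitude $v_C = m_1v_1/(m_1 + m_2)$, and

$$\mathbf v_1^* = \frac{m_2}{m_1 + m_2}\,\mathbf v_1, \qquad \mathbf v_2^* = \frac{-m_1}{m_1 + m_2}\,\mathbf v_1 \quad\implies\quad v_C/v_1^* = m_1/m_2.$$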


(b) If $\Theta^*$ is the scattering angle in the center of mass system, and $\Theta$ the scattering
angle in the laboratory system, then we have

$$\tan\Theta = \frac{w_1^*\sin\Theta^*}{v_C + w_1^*\cos\Theta^*}.$$

If, moreover, the collision is completely elastic, so that $w_1^* = v_1^*$, then

$$\tan\Theta = \frac{\sin\Theta^*}{m_1/m_2 + \cos\Theta^*}.$$


(c) Show that if $m_1 > m_2$, then this expression has a maximum at $\cos\Theta^* =
-m_2/m_1$, and conclude that $\Theta$ has a maximum possible value $\Theta_{\max}$ with
$\sin\Theta_{\max} = m_2/m_1$. This is often derived geometrically from the figure below,
where (a) illustrates the case that $v_C/v_1^* = m_1/m_2 < 1$, while in (b) we
have $v_C/v_1^* = m_1/m_2 > 1$, and the maximum value of $\Theta$ occurs for $\mathbf w_1^*$
perpendicular to $\mathbf w_1$.

If $m_1 \gg m_2$, then $\Theta_{\max}$ is close to 0; in fact, we then have $\Theta_{\max} \approx m_2/m_1$.
All of which is a fancy way of saying that when a body of large mass hits a body
of small mass, the body of large mass is hardly deflected.
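
The claim in part (c) can be checked numerically. The sketch below is an illustration only, with hypothetical function names (`lab_angle`, `max_lab_angle_numeric`) and an arbitrary mass ratio; it evaluates the elastic-collision formula from part (b) on a grid of center of mass angles and compares the largest laboratory angle found with $\arcsin(m_2/m_1)$.

```python
import math


def lab_angle(theta_cm, m1, m2):
    """Laboratory scattering angle for an elastic collision with a
    stationary target: tan(Theta) = sin(Theta*) / (m1/m2 + cos(Theta*))."""
    return math.atan2(math.sin(theta_cm), m1 / m2 + math.cos(theta_cm))


def max_lab_angle_numeric(m1, m2, samples=20_000):
    """Brute-force maximum of the laboratory angle over Theta* in [0, pi]."""
    return max(lab_angle(math.pi * k / samples, m1, m2)
               for k in range(samples + 1))


# Arbitrary example with m1 > m2: the heavy projectile is deflected
# by at most arcsin(m2/m1).
numeric = max_lab_angle_numeric(4.0, 1.0)
predicted = math.asin(1.0 / 4.0)
```

Using `atan2` rather than `atan` keeps the angle in the correct quadrant when $m_1 < m_2$, in which case the denominator can change sign and no maximum below $\pi$ exists.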




15. (a) Using the formulas for $\mathbf v_i^*$ in Problem 14, show that the momenta of
the particles in the center of mass coordinates can be written as

$$m_1\mathbf v_1^* = \mu\mathbf v, \qquad m_2\mathbf v_2^* = -\mu\mathbf v,$$

where $\mathbf v = (\mathbf v_1 - \mathbf v_2)$ and $\mu$ is the reduced mass,

$$\mu = \frac{m_1m_2}{m_1 + m_2},$$

a quantity that frequently arises in two-particle problems (see page 136).

(b) For a completely inelastic collision between two bodies of masses $m_1$ and
$m_2$ approaching each other along a line, with velocities $v_1$ and $v_2$, the resultant
common velocity is given by equation (1) on page 96. Compute that the loss of
kinetic energy is

$$\tfrac12\mu v^2, \qquad v = v_1 - v_2,$$

(the kinetic energy of a body of mass $\mu$ moving with the relative velocity $v$ of
the two particles).





16. For Problem 1-22, use conservation of energy

$$\tfrac12 m(x'^2 + y'^2) + mgy = E$$

and $y = l - \sqrt{l^2 - x^2} \approx x^2/2l$ to deduce directly that for small oscillations we
have $x'' + (g/l)x = 0$.


17. (a) The sum of conservative forces is conservative. (Trivial, but worth not- 
ing, in order to get simple examples of non-radially symmetric conservative 
forces!) 

(b) In particular, find a non-radially symmetric conservative force with a singu- 
larity at the origin, and nowhere else. 


18. Consider a force that is central, always pointing towards the origin, but not
radially symmetric. Show that it cannot be conservative.

Hint: The work done moving along any sphere around the origin will be 0.


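19. Consider a function $V\colon \mathbb R^3 \to \mathbb R$ that is homogeneous of order $k$, meaning
that

$$V(\alpha x) = \alpha^k V(x), \qquad x \in \mathbb R^3,\ \alpha \in \mathbb R;$$

note that each $\partial V/\partial x^i$ is homogeneous of order $k - 1$. Let $c$ be a particle
moving under the force with potential function $V$, so that

$$(c^i)''(t) = -\frac{\partial V}{\partial x^i}(c(t)),$$

and consider the new path

$$\tilde c(t) = \alpha\cdot c(\beta t)$$

[we change the time by a factor of $\beta$ and then the position by a factor of $\alpha$].

(a) Show that

$$(\tilde c^i)''(t) = -\alpha\beta^2\,\frac{\partial V}{\partial x^i}(c(\beta t)) = -\frac{\alpha\beta^2}{\alpha^{k-1}}\,\frac{\partial V}{\partial x^i}(\tilde c(t)).$$

Consequently, $\tilde c$ satisfies the same equation as $c$ if

$$\beta = \alpha^{(k/2) - 1}.$$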

(b) In the case of a uniform gravitational field (like that near the earth's surface)
we have $k = 1$. So for any path $c$, the path $\tilde c(t) = \alpha\cdot c(t/\sqrt{\alpha})$ is also a solution.
This basically reproves the result of Problem 1-17 (a).
(c) Generalize part (b) of that problem similarly.
(d) Also use this general result to reprove Problem 2-1 (a).


20. Find the “escape velocity” of a rocket on earth—the smallest initial velocity 
it must have so that it never falls back to earth (ignoring air resistance, the 
gravitational force of the sun, etc.), in terms of g and the earth’s radius Re. 
Notice that this does not depend on the rocket being fired directly upwards. 
(But rocket launches nearer the equator are better because the centrifugal force 
due to the earth’s rotation is greater there, so that the rocket actually starts with 
a greater velocity.) 


In contrast to Problem 1, the next problem shows that although it sometimes
pays not to think too hard about a physics problem, at other times the difficulty
is thinking about what the problem is coyly trying to say.


21. (a) A small object travels in a circle on a frictionless table, held in place
by a string that passes through a hole in the table. The string is slowly pulled
through the hole so that the radius of the circle changes from $r_0$ to $r_1$. Show
that the work done pulling the string equals the increase in kinetic energy of
the object.

“Slowly” means: pretend that the object moves in a series of circles instead of a
spiral (and that the force exerted upon it is radial).




(b) For a string being pulled through the hole with a constant force of magnitude
$F$ (not “slowly”), find the velocity at time $t$ in terms of the initial velocity $v_0$
when the object is traveling in a circle of radius $r_0$.


22. (a) In Problem 1-15, the force must be $F(t) = m'(t)v$. Show that the total
work done by this force from time 0 to $t$ will be

$$\int_0^t vF(t)\,dt = \int_0^t v^2m'(t)\,dt = v^2m(t),$$

which is twice the kinetic energy of an object of mass $m(t)$ moving with speed $v$.

Why isn't this a contradiction?
To put it another way, what does this say about this force?


(b) Suppose a rope is lying on the floor, and it is pulled up from one end with 
constant speed. Compare the work done just after the whole rope has been 
pulled up with the total potential plus kinetic energy that the rope has. 


23. (a) Check that conservation of energy holds for the calculated solution to 
Problem 1-13 (a chain sliding off a table). 
(b) Similarly, use conservation of energy to derive this solution. 


24. For the folded chain problem of Addendum 3A, derive equations (B) on 
page 106 from conservation of energy. 


25. Suppose that we solved the original Cayley problem in the same way, by
assuming conservation of energy. Using the formula for $\Delta E$ on page 102, show
that we would get

$$x'^2 = g\Bigl(x - \frac{a^2}{x}\Bigr), \qquad x'' = \frac{g}{2}\Bigl(1 + \frac{a^2}{x^2}\Bigr),$$

which still has the disconcerting continuity when the chain leaves the table.


26. A ball of mass $M$ having a coefficient of restitution nearly equal to 1 in
collisions with the floor is dropped on the floor with a ball of mass $m < M$
sitting on top of it. What happens? (To avoid the sorts of difficulties encountered
in Addendum 3B, assume that the two balls are actually separated by a small
distance $\varepsilon$.) The Wham-O® SuperBall® (for which Wikipedia has an interesting
entry) can be used for an instructive experiment, with something like a marble
for the top ball.

[Figure: a small ball of mass m on top of a larger ball of mass M.]





27. A space ship of mass $m$, launched far away from a planet of mass $M$ and
radius $R$, has been traveling with constant speed $v_0$ along a path parallel to the
line $\ell$ through the center of the planet, at distance $d$ from this line. It is now
beginning to approach the planet, and we want to know how large $d$ can be so
that it will hit the planet. If $r$ is the distance from the space ship to the planet,
then the total energy of the space ship is

$$\tfrac12 mv^2 - \frac{mMG}{r},$$

where $r$ is the distance from the center of the planet to the space ship (and $G$
is the constant in the law of gravity [page 37]).

(a) Far from the planet, the angular momentum of the rocket ship with respect
to the center of the planet has magnitude $mdv_0$, and the total energy has value
$\tfrac12 mv_0^2$.

(b) Suppose that the ship just grazes the planet, with speed $w$. Show that the
angular momentum at that time has magnitude $mRw$, and the total energy has
the value

$$\tfrac12 mw^2 - \frac{mMG}{R}.$$

Explain why we can conclude that

$$d^2 = R^2\Bigl(1 + \frac{2MG}{Rv_0^2}\Bigr).$$
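
As a sanity check on the final formula of Problem 27, the sketch below (an illustration, not part of the problem) computes the largest impact distance $d$ and confirms that the grazing speed $w = dv_0/R$ obtained from angular momentum also satisfies the energy equation. The function names and the toy values used in the check are assumptions chosen for convenience; the default value of `G` is the usual SI value of the gravitational constant.

```python
import math


def max_impact_distance(R, M, v0, G=6.674e-11):
    """Largest distance d from the line through the planet's center for
    which the ship still hits: d^2 = R^2 * (1 + 2*M*G / (R*v0^2))."""
    return R * math.sqrt(1.0 + 2.0 * M * G / (R * v0 * v0))


def grazing_speed(d, R, v0):
    """Speed w at grazing contact, from conservation of angular momentum
    per unit mass: d*v0 = R*w."""
    return d * v0 / R


# Toy units (G = 1.5, M = R = v0 = 1) chosen so that d comes out to exactly 2.
d = max_impact_distance(1.0, 1.0, 1.0, G=1.5)
w = grazing_speed(d, 1.0, 1.0)
energy_far = 0.5 * 1.0**2              # (1/2) v0^2 per unit mass
energy_graze = 0.5 * w**2 - 1.5 / 1.0  # (1/2) w^2 - M*G/R per unit mass
```

Note that $d > R$ always: gravity focuses the trajectory, so the planet presents a larger effective target than its geometric cross-section, and the effect grows as $v_0$ decreases.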


CHAPTER 4 


THE ONE-BODY 
AND TWO-BODY PROBLEMS 


After the bruising engagement with real-world problems of the previous
chapter, we beat a hasty temporary retreat to the safer realm of more
purely mathematical questions, restricting our consideration of “conservation
of energy” to its most basic form for mechanics: conservation of kinetic plus
potential energy. This chapter is a sort of companion to Chapter 2, giving a
connected modern treatment of the material covered there, together with further
developments, most of which Newton also treated in the Principia.


The one-body problem. Consider a particle

$$c(t) = r(t)\cdot(\cos\theta(t), \sin\theta(t))$$

moving under a radially symmetric central force, so that

$$m\mathbf v' = -(f\circ r)\cdot\frac{c}{r};$$

as in Chapter 3, we add the $-$ sign so that a positive $f$ corresponds to an attractive
force. A planet moving around the sun is the prototype of this problem,
which is usually called the “one-body problem” because we are ignoring the
force that the planet exerts on the sun, and thus really only considering a single
object under the influence of some force that we do not specifically attribute to
another particle.

As we noted in the previous chapter, our force is conservative, with a potential
energy function $V$. Writing $v$ in terms of $r$ and $\theta$, the conservation of energy
formula

$$\tfrac12 mv^2 + V = E$$

becomes

(1)   $\tfrac12 m(r'^2 + r^2\theta'^2) + V\circ c = E.$

We also have (page 56)

(2)   $r^2\theta' = h$   for a constant $h$.




The One- and Two-body Problems 121 


Squaring (2), and substituting into (1), we obtain

(A)   $r'^2 = \dfrac{2}{m}(E - V\circ c) - \dfrac{h^2}{r^2}$   or   $r'^2 = -\dfrac{2}{m}(F\circ r) - \dfrac{h^2}{r^2},$

where the second equation, previously obtained on page 75, has the potential
function $V\circ c$ written in terms of $F$ with $F' = f$, which is only defined up to
a constant; in the first equation the constant is written as $E$, which amounts to
a choice of the constant in $V$.

Note that in the second equation of (A) the mass $m$ doesn't really play a role,
because we are going to be considering forces where $f$ is proportional to $m$.
Similarly, $m$ can be ignored in the first equation if we replace $E$ by $\mathcal E = E/m$,
the total “energy per unit mass”; in the same vein, the angular momentum is
$mh$, so $h$ is just the angular velocity, or “angular momentum per unit mass”.
Often it is convenient simply to assume that $m = 1$.

Taking the derivative of (A), and dividing by $r'$, we obtain

(B)   $r'' = -\dfrac{f\circ r}{m} + \dfrac{h^2}{r^3}.$

As on page 101, we technically need to be more careful, especially for a circular
orbit, with $r$ constant. Note, by the way, that an attractive force $f > 0$ always
has circular orbits for any radius $\rho$,

$$c(t) = \rho(\cos\alpha t, \sin\alpha t),$$

since we just need to choose $\alpha$ so that $f(\rho)$ is $m$ times the magnitude of the
acceleration, or $f(\rho) = m\rho\alpha^2$; in terms of $h = \rho^2\alpha$, we need

(B$_\rho$)   $mh^2 = \rho^3f(\rho),$

which is what (B) then reduces to.

The solutions $r$ of (B), together with (2), giving $\theta$, theoretically provide all
orbits, complete with parameterization, but to determine only the shape of
the orbit we can combine (2) and (A) to find the derivative of $r\circ\theta^{-1}$. As in
Problem 2-6, calculations are simplified by using Leibnizian notation. We have

$$\frac{dr}{d\theta} = \frac{dr}{dt}\Big/\frac{d\theta}{dt},$$

and obtain [with the obvious interpretation of $V(r)$]

$$\frac{dr}{d\theta} = \frac{r^2}{h}\sqrt{\frac{2}{m}(E - V(r)) - \frac{h^2}{r^2}} \quad\text{or}\quad \frac{dr}{d\theta} = \frac{r^2}{h}\sqrt{-\frac{2}{m}F(r) - \frac{h^2}{r^2}}$$

122 Chapter 4 


and then

$$\theta = \int\frac{h\,dr}{r^2\sqrt{\dfrac{2}{m}(E - V(r)) - \dfrac{h^2}{r^2}}} \quad\text{or}\quad \theta = \int\frac{h\,dr}{r^2\sqrt{-\dfrac{2}{m}F(r) - \dfrac{h^2}{r^2}}},$$

theoretically allowing us to express $\theta$ in terms of $r$, and thus $r$ in terms of $\theta$.
Although Newton's first derivation of the orbits for an inverse square law was
the geometric one given in Chapter 2, later in the Principia Newton also gave
essentially these same equations, though stated entirely in geometric terms (see
Addendum A).


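It is usually more convenient to write our equations in terms of $u = 1/r$, with

(C)   $\dfrac{du}{d\theta} = -\dfrac{1}{h}\sqrt{\dfrac{2}{m}(E - V(1/u)) - h^2u^2}$   or   $\dfrac{du}{d\theta} = -\dfrac{1}{h}\sqrt{-\dfrac{2}{m}F(1/u) - h^2u^2},$

and thus

(D)   $\theta = -\displaystyle\int\frac{h\,du}{\sqrt{\dfrac{2}{m}(E - V(1/u)) - h^2u^2}}$   or   $\theta = -\displaystyle\int\frac{h\,du}{\sqrt{-\dfrac{2}{m}F(1/u) - h^2u^2}}.$

Squaring (C) gives

(E)   $\Bigl(\dfrac{du}{d\theta}\Bigr)^2 + u^2 = \dfrac{2}{mh^2}(E - V(1/u))$   or   $\Bigl(\dfrac{du}{d\theta}\Bigr)^2 + u^2 = \dfrac{-2F(1/u)}{mh^2},$

and taking the derivative of (E) and dividing by $du/d\theta$ yields the equation

(F)   $\dfrac{d^2u}{d\theta^2} + u = \dfrac{f(1/u)}{mh^2u^2}.$

These equations all involve the usual deviousness of Leibnizian notation; for
example, the term $u$ in (F) actually means “$u$ as a function of $\theta$” (i.e., $u\circ\theta^{-1}$).
This facilitates a lot of manipulations of equations, but on at least one occasion
(cf. page 130) it will necessitate a bit of extra care.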


Although the case of an inverse square force is of most interest in terms of
gravitation, other radially symmetric forces are important in physics, and we
can get a good idea of the general nature of orbits under any such force. There
are only a few examples where the equations for $n$th power forces can be solved
in terms of elementary functions, but they provide a good introduction to the
general nature of orbits.


1. First we have the case $f(r) = mKr$, for a constant $K > 0$, treated previously
in Problem 2-4. For the somewhat cumbersome solution in terms of our general
equations, we have $V(r) = mKr^2/2$, so equation (D) becomes (with $\mathcal E = E/m$)

$$\theta = -\int\Bigl(\frac{2\mathcal E}{h^2} - \frac{K}{h^2u^2} - u^2\Bigr)^{-1/2}du,$$

and the substitution $u = \sqrt v$ gives

$$\theta = -\frac12\int\Bigl(\frac{2\mathcal E}{h^2}v - \frac{K}{h^2} - v^2\Bigr)^{-1/2}dv = -\frac12\int\frac{dv}{\sqrt{A^2 - (v - B)^2}}$$

for

$$A^2 = \frac{\mathcal E^2}{h^4} - \frac{K}{h^2} \quad\text{and}\quad B = \frac{\mathcal E}{h^2}.$$

Thus we have

$$2\theta = \arccos\Bigl(\frac{v - B}{A}\Bigr),$$

where we have replaced the more precise $\theta + \theta_0$ simply by $\theta$, as we will continue
to do for the other examples, since this just amounts to a rotation of our axes.
We can write this as

(a)   $\dfrac{1}{r^2} = B + A\cos 2\theta,$

which is an ellipse centered at the origin (Problem 2).


2. For the case $f(r) = mK/r^2$, for gravitational attraction, it is easiest to use
equation (F), which simply becomes

$$\frac{d^2u}{d\theta^2} + u = \frac{K}{h^2},$$

with the solution $u = \frac{K}{h^2} + (\text{constant})\cdot\cos\theta$, which it will be convenient to write
as $u = \frac{K}{h^2} + KA\cos\theta$. We can assume that $A > 0$, since this just amounts to
replacing $\theta$ with $\theta + \pi$, and we can write our solution as

(b)   $r = \dfrac{h^2/K}{1 + h^2A\cos\theta},$

which is a conic section with the origin as focus, and eccentricity $\varepsilon = h^2A$ (see
Problem 3 for a review).

Substituting the solution $u = \frac{K}{h^2} + KA\cos\theta$ into (E) and simplifying, we obtain
(with $\mathcal E = E/m$)

$$A^2 = \frac{1}{h^4} + \frac{2\mathcal E}{h^2K^2},$$

so we have

(b$_1$)   $\varepsilon = \sqrt{1 + 2h^2\mathcal E/K^2}.$


This shows that

the orbit is an ellipse if $\mathcal E < 0$, a parabola if $\mathcal E = 0$, and a hyperbola if $\mathcal E > 0$.

In the figure below, for a particle of mass $m = 1$ with $V(r) = -1/r$, on the
circular orbit of radius 1 we have $V = -1$. Equation (B$_\rho$) gives $h^2 = 1$, so
$\theta' = 1$ and the kinetic energy is $\tfrac12$, so that $E = -\tfrac12$, which we could also have
obtained from (b$_1$), since the circle has eccentricity $\varepsilon = 0$. As we increase the
initial velocity at $P$, with $h$ increasing from 1 to $\sqrt2$, we get ellipses with energy
approaching 0, and thus a parabola, with eccentricity 1; and for larger initial
velocities we get hyperbolas.
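
The worked example just described can be reproduced in a few lines. This is a minimal sketch, assuming $m = 1$, $K = 1$ and a purely tangential launch at distance $r = 1$ (so that the launch speed equals $h$); the function names are hypothetical. It evaluates the eccentricity formula $\varepsilon = \sqrt{1 + 2h^2\mathcal E/K^2}$ and classifies the orbit by the sign of the energy.

```python
import math


def energy_at_launch(h, K=1.0, r=1.0):
    """Energy per unit mass for a tangential launch at distance r with
    angular momentum per unit mass h, under V(r) = -K/r."""
    speed = h / r
    return 0.5 * speed**2 - K / r


def eccentricity(h, energy, K=1.0):
    """Eccentricity of the conic: sqrt(1 + 2 h^2 E / K^2)."""
    # max() guards against a tiny negative argument from rounding
    return math.sqrt(max(0.0, 1.0 + 2.0 * h * h * energy / (K * K)))


def orbit_type(energy, tol=1e-12):
    """Classify the conic by the sign of the energy per unit mass."""
    if abs(energy) <= tol:
        return "parabola"
    return "ellipse" if energy < 0 else "hyperbola"


for h in (1.0, 1.2, math.sqrt(2.0), 2.0):
    E = energy_at_launch(h)
    print(f"h = {h:.4f}  E = {E:+.4f}  eccentricity = {eccentricity(h, E):.4f}  {orbit_type(E)}")
```

The tolerance in `orbit_type` is a design choice: in floating point the borderline value $h = \sqrt2$ gives an energy of order $10^{-16}$ rather than exactly 0, so an exact comparison would misclassify the parabola.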


[Figure: orbits through the point P = (0,1) for increasing initial velocity.]


When the conic section (b) is an ellipse, the semiaxes $a$ and $b$ are given
(Problem 3) by

(b$_2$)   $a = \dfrac{h^2/K}{1 - \varepsilon^2}$   and   $b = a\sqrt{1 - \varepsilon^2},$

while (b$_1$) can be written $1 - \varepsilon^2 = -2h^2\mathcal E/K^2$, so that we have

(b$_3$)   $a = \dfrac{K}{-2\mathcal E}.$

Consequently, the total area of the ellipse is

(b$_4$)   $\pi ab = \pi a^2\sqrt{1 - \varepsilon^2} = \dfrac{\pi hK}{\sqrt{(-2\mathcal E)^3}}.$

Since the area of the graph of the function $r(\theta)$ between $\theta_0$ and $\theta_1$ is given
by $\tfrac12\int_{\theta_0}^{\theta_1} r^2\,d\theta$, the area of the ellipse swept out from time $t_0$ to $t_1$ is

$$\frac12\int_{t_0}^{t_1} r(t)^2\theta'(t)\,dt = \frac{h}{2}(t_1 - t_0).$$

So the “period” of the orbit, the time $T$ for an orbit to be completely covered
once, is given, by (b$_4$), by

$$T = \frac{2}{h}\cdot\text{area of ellipse} = \frac{2\pi K}{\sqrt{(-2\mathcal E)^3}},$$

depending only on the energy $\mathcal E$ of the orbit. Moreover, (b$_3$) also allows this to
be written as

$$T = 2\pi\sqrt{\frac{a^3}{K}},$$

depending only on the length of the semimajor axis.

In the case of a force of magnitude $GMm/r^2$, with $M$ being the mass of the
sun, and $G$ the “universal constant” in the law of gravitation, we have

$$T = 2\pi\sqrt{\frac{a^3}{GM}} \quad\text{or}\quad T = \frac{2\pi GM}{\sqrt{(-2\mathcal E)^3}}.$$
From the first of these we have Kepler's Third Law: The squares of the periods
of the planets are proportional to the cubes of the major axes of their orbits;
conversely, the observational evidence for Kepler's Third Law shows that the
forces on the various planets must involve the same constant $G$ for all of them.

Hyperbolic orbits are of some importance in studying particles around the
sun that come from or escape to outer space, but are most interesting in regard
to Rutherford's early investigations of the structure of the atom, as described in
Addendum C, where the appropriate formulas for such orbits are presented.
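
Returning to the period formula $T = 2\pi\sqrt{a^3/GM}$ behind Kepler's Third Law, a short numerical illustration may help. The sketch below is not from the text: the function name `orbital_period` is hypothetical, and the two constants are the standard published values of the sun's gravitational parameter $GM$ and of the astronomical unit, supplied here as assumptions.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 / s^2, standard gravitational parameter of the sun
AU = 1.495978707e11         # m, one astronomical unit
SECONDS_PER_DAY = 86400.0


def orbital_period(a, gm=GM_SUN):
    """Period T = 2*pi*sqrt(a^3 / GM) of an elliptical orbit with
    semimajor axis a (in meters); the result is in seconds."""
    return 2.0 * math.pi * math.sqrt(a**3 / gm)


# An orbit with semimajor axis 1 AU takes about one year.
earth_days = orbital_period(AU) / SECONDS_PER_DAY

# Kepler's Third Law: quadrupling the semimajor axis multiplies T by 4^(3/2) = 8.
ratio = orbital_period(4.0 * AU) / orbital_period(AU)
```

Only the product $GM$ enters, which is why it is known far more precisely from planetary observations than either $G$ or $M$ separately.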


3. The case $f(r) = mK/r^3$ can also be solved explicitly. It is convenient to
start directly with (A), which becomes

(c)   $r'^2 = C + \dfrac{K - h^2}{r^2}$

for a constant $C$. For $C = 0$, we must have $h^2 \le K$, and we can divide by $\theta' = h/r^2$ to get

$$\frac{dr}{d\theta} = \frac{\sqrt{K - h^2}}{h}\,r,$$

giving

$$r = a\cdot e^{\gamma\theta} \quad\text{for } \gamma = \sqrt{K - h^2}/h.$$

For $K = h^2$, or $\gamma = 0$, we get a circle. Otherwise, we obtain a logarithmic spiral,
also called an equiangular spiral (Problem 6), which spirals around infinitely often
as it approaches the origin, as well as when it approaches infinity, with $r$ growing
monotonically in each direction.


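For the general case of our equation (c) with $C \ne 0$, we first have to consider
yet another special case, namely $K = h^2$. Then $r'^2 = C$, so $r'$ is constant, and we have

$$\frac{dr}{d\theta} = \frac{dr}{dt}\Big/\frac{d\theta}{dt} = \frac{r'}{\theta'} = \frac{r'}{h}\,r^2,$$

and thus

$$d\theta = \frac{h\,dr}{r'\,r^2} \quad\implies\quad \theta = -\frac{h}{r'\,r}.$$

The equation $r = \text{constant}/\theta$ is a hyperbolic spiral, also known as a reciprocal
spiral (Problem 7).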




On the other hand, if $K \ne h^2$, there will be $r_0$ with $r'(r_0) = 0$ (a point either
at minimum or maximum distance from the origin). For $K > h^2$, we write (c)
as

$$r'^2 = (K - h^2)\Bigl(\frac{1}{r^2} - \frac{1}{r_0^2}\Bigr) \qquad (r < r_0).$$

Dividing by $\theta' = h/r^2$ we obtain (up to sign)

$$\frac{d\theta}{dr} = \frac{hr_0}{\sqrt{K - h^2}}\cdot\frac{1}{r\sqrt{r_0^2 - r^2}}$$

with the solution

$$\theta = \frac{h}{\sqrt{K - h^2}}\cosh^{-1}\frac{r_0}{r},$$

spiraling around the origin as it approaches it. Part (a) of the figure below shows
the orbit from $t = 0$ to $t = \infty$, while (b) adds the orbit from $t = -\infty$ to $t = 0$.

[Figure: (a) the orbit for $t \ge 0$; (b) the full orbit.]

These curves are sometimes known as Cotes' spirals (Problem 8).
If $K < h^2$, we obtain instead

$$\frac{d\theta}{dr} = \frac{hr_0}{\sqrt{h^2 - K}}\cdot\frac{1}{r\sqrt{r^2 - r_0^2}}$$

with a solution that goes off to infinity for $\theta$ in a finite interval,

$$\theta = \frac{h}{\sqrt{h^2 - K}}\arccos\frac{r_0}{r}.$$



Newton's investigation of inverse cube forces in the Principia is split into two
parts. Early on, Newton determines geometrically that an inverse cube force is
required to produce an equiangular spiral as an orbit, just before determining
geometrically that a force proportional to distance is needed to produce an
elliptical orbit centered at the origin (this is the result referred to in Problem 2-4).
The general case of an inverse cube force was handled only later, after he had
the equivalent of the equations in this chapter. Oddly enough, although Newton
used those results to investigate inverse cube forces in general, he never bothered
to use the general results to redo the case of the inverse square force; see
Chandrasekhar [2; pp. 172-180] for an illuminating discussion.

The case $f(r) = 1/r$ (where $F = \log r$ isn't even a power function) cannot
be solved explicitly, and it usually isn't even considered in physics texts, because
forces of this sort normally arise from a line source, rather than a point source.
But a graph of the solution gives a good illustration of the general nature of
orbits. The orbit, shown in (a), can be constructed, as shown in (b), from a
single piece that is reflected over and over again. This basic piece goes between
two apsides, an apsis being a point of maximum or minimum distance from
the center; a point at minimum distance is a periapsis or pericenter, a point at
maximum distance an apoapsis or apocenter.¹ The angle shown in (c) between
two apsides is called the apsidal angle.

[Figure: (a) the orbit; (b) its construction by reflecting a basic piece; (c) the apsidal angle.]

For the case $f(r) = mr$, the apsides are the ends of the two axes, with apsidal
angle $\pi/2$; for the case $f(r) = m/r^2$, hyperbolic and parabolic orbits have only
a pericenter, while an ellipse with focus at the origin has the two ends of the
major axis as the apsides, with apsidal angle $\pi$; for the case $f(r) = m/r^3$, our
first two solutions have no apsides, our third solution has only an apocenter,
and the fourth has only a pericenter. Many of these features are often discussed
in terms of a “reduction to a one-dimensional problem”, as in Addendum B.

For the general case, note that equation (F) has the form

$$\frac{d^2u}{d\theta^2} = g(u)$$

and we have $du/d\theta = 0$ at an apsis point $\theta = \theta_0$. Changing $\theta_0 + \theta$ to $\theta_0 - \theta$
produces the same equation, with the same initial condition $du/d\theta = 0$ at $\theta_0$,

¹ The terms perigee and apogee are used for the moon revolving around the earth,
perihelion and aphelion for the planets around the sun, and a flock of analogous terms are
used for other astronomical bodies.




which shows that the orbit is symmetric with respect to the line drawn from the
origin to an apsis, so every orbit with two apsides can be obtained by this general
method of reflecting a basic piece. The basic portion of the orbit need not be
concave with respect to the origin, so a more complicated figure is needed to
give some idea of the general orbit. On the other hand, the figure below shows
an orbit that is concave with respect to the origin, for the force $f(r) = r^{-2.1}$,
looking vaguely like an ellipse revolving around its focus, together with the orbit
for the force $f(r) = r^{-1.9}$, revolving in the other direction.


“The motion of bodies in mobile orbits, and the motion of the apsides”. That is
the title of the section that Newton presents after deriving the basic equations for
central force motion and applying them to the inverse cube law. Consider a particle
moving in an orbit around the origin of a central force, and now suppose that
the orbit itself is revolved around that origin in some way. If our original orbit is

$$c(t) = r(t)\cdot(\cos\theta(t), \sin\theta(t)),$$

with constant $h = r^2\theta'$, our “revolving orbit” can be described as

$$\tilde c(t) = r(t)\cdot(\cos\tilde\theta(t), \sin\tilde\theta(t))$$

for some function $\tilde\theta$; for simplicity we assume that $\tilde\theta(0) = \theta(0) = 0$. In order for
$r^2\tilde\theta'$ also to be constant, so that the revolving orbit is also due to a central force,
we must have $\tilde\theta' = \alpha\theta'$ for some constant $\alpha$, and thus $\tilde\theta = \alpha\theta$, so that we can
write

$$\tilde c(t) = r(t)\cdot(\cos\tilde\theta(t), \sin\tilde\theta(t)), \qquad \tilde\theta(t) = \alpha\theta(t),$$

where now $r^2\tilde\theta' = \tilde h = \alpha h$.




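From equation (F) for the original orbit $c$,

(F$_c$)   $\dfrac{d^2u}{d\theta^2} + u = \dfrac{f(1/u)}{mh^2u^2},$

we can derive a corresponding equation for the revolving orbit $\tilde c$, and thus
determine the force $\tilde f$ needed to produce this orbit. We have¹

(a)   $\dfrac{d\tilde u}{d\tilde\theta}(\alpha\phi) = \dfrac{1}{\alpha}\dfrac{du}{d\theta}(\phi)$   and then   $\dfrac{d^2\tilde u}{d\tilde\theta^2}(\alpha\phi) = \dfrac{1}{\alpha^2}\dfrac{d^2u}{d\theta^2}(\phi).$

In the desired equation for $\tilde c$

$$\frac{d^2\tilde u}{d\tilde\theta^2} + \tilde u = \frac{\tilde f(1/\tilde u)}{m\tilde h^2\tilde u^2}$$

the term $\tilde u$ really stands for “$\tilde u$ as a function of $\tilde\theta$”, so that the equation means

$$\frac{d^2\tilde u}{d\tilde\theta^2}(\alpha\phi) + \tilde u(\alpha\phi) = \frac{\tilde f(1/\tilde u(\alpha\phi))}{m\tilde h^2\tilde u^2(\alpha\phi)}.$$

Substituting from the second equation in (a), using $\tilde u(\alpha\phi) = u(\phi)$ and $\tilde h = \alpha h$, and
multiplying by $\alpha^2$, we obtain

(F$_{\tilde c}$)   $\dfrac{d^2u}{d\theta^2} + \alpha^2u = \dfrac{\tilde f(1/u)}{mh^2u^2},$

where the common argument $\phi$ has now been dropped without ambiguity.
Substituting from (F$_c$) then gives

$$\tilde f(1/u) = mh^2u^2\Bigl(\alpha^2u + \frac{f(1/u)}{mh^2u^2} - u\Bigr) = f(1/u) + mh^2(\alpha^2 - 1)u^3,$$

so that the force for a revolving version of an orbit differs from the force for
the orbit itself by an inverse cube force. As usual, the $m$ on the right side is
essentially irrelevant, since $f$ and $\tilde f$ will normally be taken proportional to $m$.
This result is known, or at any rate was once known, as Newton's theorem of
revolving orbits.

¹ To obtain (a) we can write $d\tilde u/d\tilde\theta = (du/d\theta)/(d\tilde\theta/d\theta)$, or if necessary explicitly write
$d\tilde u/d\tilde\theta = (\tilde u\circ\tilde\theta^{-1})'$, etc.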




Revolving orbits were introduced in the Principia in order to investigate or- 
bits that aren’t precisely elliptical, but whose apsides rotate in time, the prime 
example for Newton being the orbit of the moon, which is close to circular, 
just like all the planetary orbits. Newton used the formula just derived to ap- 
proximate an almost circular orbit under any central force by a revolving orbit, 
starting with an elliptical orbit under an inverse square force, and then adding 
an appropriate inverse cube force to get a revolving elliptical orbit close to the 
given orbit. 

To solve this problem, Newton works backwards, finding a revolving elliptical
orbit for which the requisite force would be proportional to the given central
force. So, for a given central force $f$ and a particle, assumed to have mass $m = 1$
for simplicity, moving under this force in an orbit that is close to a circle of
radius $\rho$, we first choose an ellipse having the center of the force as a focus,
with the end of the major axis at a point on the orbit at distance $\rho$ from the
center. This is the orbit under an inverse square force, and for simplicity we
will choose our units so that it is an orbit for the force $g(r) = 1/r^2$; equation (b)
on page 124 then shows that $\rho = h^2$.





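Now consider the revolving orbit we obtain from this ellipse for some $\alpha$, so
that it is the orbit under the force

$$\tilde g(r) = \frac{1}{r^2} + \frac{h^2(\alpha^2 - 1)}{r^3} = \frac{1}{r^2} + \frac{\rho(\alpha^2 - 1)}{r^3}.$$

Letting $k$ be the ratio of $f(\rho)$ to $\tilde g(\rho)$,

$$f(\rho) = k\cdot\tilde g(\rho) = k\cdot\frac{\alpha^2}{\rho^2},$$

we are now going to choose $\alpha$ so that $f = k\cdot\tilde g$ up to first order, i.e., so that we
also have

$$f'(\rho) = k\cdot\tilde g'(\rho).$$

This means that we want

$$f'(\rho) = k\cdot\frac{1 - 3\alpha^2}{\rho^3} = \frac{1 - 3\alpha^2}{\alpha^2\rho}\,f(\rho),$$

hence

(∗)   $\alpha^2 = \dfrac{f(\rho)}{3f(\rho) + \rho f'(\rho)}.$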


Newton introduced this calculation with the explanation that “Orbits will
acquire the same shape if the centripetal forces with which those orbits are
described, when compared with each other, are made proportional at equal
heights.” I.e., given two central forces $f_1$ and $f_2$ with $f_1 = k\cdot f_2$, if $c$ is an orbit
for $f_2$, then $\gamma(t) = c(\sqrt k\cdot t)$, with only a multiplicative change of parameter,
will be an orbit for $f_1$. Since $f = k\cdot\tilde g$ up to first order, Newton concludes that

An orbit close to the circular orbit of radius $\rho$ will be close to the revolving
orbit obtained for this choice of ellipse and $\alpha$.


This actually requires continuity with respect to the defining equation, the subject 
of Problem 12, rather than continuity with respect to initial conditions, which 
is what differential equation texts customarily prove. 

Moreover, since the ellipse has the apsidal angle $\pi$ and thus the apsidal angle
of the rotating orbit is $\pi\alpha$, Newton further concludes that

The apsidal angle of our given orbit must be close to $\pi\alpha$ for this $\alpha$.


This is the statement most often found in modern texts, with an argument not
relying on revolving orbits at all. We begin with equation (B)

$$r'' = -f(r) + \frac{h^2}{r^3},$$

together with equation (B$_\rho$), that for the circular orbit of radius $\rho$ we have

$$h^2 = \rho^3f(\rho).$$

Writing $r(t) = \rho + x(t)$ for a small “perturbation” $x(t)$, we get the equation

(P)   $x''(t) + g(x(t)) = 0,$

where

$$g(y) = f(\rho + y) - \frac{\rho^3f(\rho)}{(\rho + y)^3},$$

with

(P$_0$)   $g(0) = 0, \qquad g'(0) = \dfrac{3f(\rho) + \rho f'(\rho)}{\rho}.$

If $g$ is linear, $g(y) = g'(0)y$, then the solutions of (P) with $x(0) = 0$ are
just multiples of $x(t) = \sin(\sqrt{g'(0)}\cdot t)$, with a semiperiod of $\pi/\sqrt{g'(0)}$ (we're
assuming $g'(0) > 0$; the case $g'(0) < 0$ will be discussed later). So it would
seem that the semiperiod $\sigma$ of a small solution $x$ of (P) ought to satisfy

(∗∗)   $\sigma \approx \pi/\sqrt{g'(0)} = \pi\big/\sqrt{(3f(\rho) + \rho f'(\rho))/\rho}.$

Taking into account the fact that the angular speed $\theta'$ satisfies

$$\theta' \approx \frac{h}{\rho^2} = \sqrt{\frac{f(\rho)}{\rho}},$$

the approximation (∗∗) for $\sigma$ is consistent with equation (∗) for $\alpha$. To complete
the argument rigorously we need the following result, which is hardly ever
explicitly stated, let alone proved.


PERIOD LEMMA. For small $x$ satisfying

$$x''(t) + g(x(t)) = 0, \qquad g(0) = 0,\ g'(0) > 0,$$

the semiperiod of $x$ varies continuously with $x$ [i.e., varies continuously with
$x'(0)$ for solutions with $x(0) = 0$], and approaches $\pi/\sqrt{g'(0)}$ as $x$ approaches 0.


PROOF.¹ Without loss of generality we can assume that g′(0) = 1, by rescaling
time, i.e., by considering $x(t/\sqrt{g'(0)})$.

Let G be the function $G(x) = \int_0^x g$, i.e., the function with G′ = g and
G(0) = 0, and define the function η by $\eta(t) = (\operatorname{sgn} t)\cdot\sqrt{2G(t)}$.
Then η is differentiable at 0; in fact, for the left- and right-hand derivatives
at 0 we have
$$\eta'_{\pm}(0)^2 = \lim_{h\to 0^{\pm}} \frac{2G(h)}{h^2} = \lim_{h\to 0^{\pm}} \frac{g(h)}{h} = g'(0) = 1,$$
and we can easily conclude that η′(0) = 1. Thus η⁻¹ is differentiable on some
interval around 0, with (η⁻¹)′(0) = 1.

For a solution x, suppose that x′(t₀) = 0, so that x(t₀) is a relative maximum
or minimum point, and let t₁ > t₀ be the next point where x′(t₁) = 0, so that
x(t₁) is the next relative maximum or minimum point, and the semiperiod is
thus t₁ − t₀.

For any solution x, the “energy” ½(x′)² + G∘x has derivative 0, so
$$\tfrac12 (x')^2 + G\circ x = E_x$$
for a constant $E_x$, depending continuously on x, with $E_0 = 0$. Since
$$x' = \pm\sqrt{2}\,\sqrt{E_x - G\circ x}$$
(with a fixed sign between t₀ and t₁), the substitution dx = x′ dt gives
$$\text{semiperiod of } x = \int_{t_0}^{t_1} dt = \int_{x(t_0)}^{x(t_1)} \frac{dx}{\pm\sqrt{2}\,\sqrt{E_x - G(x)}}.$$
Now we use the substitution $X = \eta\circ x = \pm\sqrt{2G\circ x}$ [essentially
replacing the solution x with a sine curve], with
$$dX = \frac{g\circ x}{\pm\sqrt{2G\circ x}}\,dx, \qquad dx = \frac{X\,dX}{g\circ x}.$$
The limits of integration become $\pm\sqrt{2G(x(t_i))}$, and since x′(tᵢ) = 0, we
have $G(x(t_i)) = E_x$, so we obtain
$$\text{semiperiod of } x = \int_{-\sqrt{2E_x}}^{\sqrt{2E_x}} \frac{X\,dX}{\sqrt{2}\;g(\eta^{-1}(X))\,\sqrt{E_x - X^2/2}}.$$
For the limit as x → 0, we use the substitution $X = \sqrt{2E_x}\,\sin\theta$
[essentially “blowing up” the singular solution x = 0] to express the semiperiod
of x as
$$\int_{-\pi/2}^{\pi/2} \frac{\sqrt{2E_x}\,\sin\theta}{g(\eta^{-1}(\sqrt{2E_x}\sin\theta))}\,d\theta
 = \int_{-\pi/2}^{\pi/2} \frac{1}{\eta'(\eta^{-1}(\sqrt{2E_x}\sin\theta))}\,d\theta
 = \int_{-\pi/2}^{\pi/2} (\eta^{-1})'(\sqrt{2E_x}\sin\theta)\,d\theta.$$
Since $E_0 = 0$, the semiperiod of x thus approaches
$$\int_{-\pi/2}^{\pi/2} (\eta^{-1})'(0)\,d\theta = \int_{-\pi/2}^{\pi/2} 1\,d\theta = \pi. \qquad\blacksquare$$

¹ This proof comes from C. Chicone and M. Jacobs [1]. According to Prof. Chicone,
the result, being already known, was merely given for completeness.
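The Period Lemma can be checked numerically. The following is only a minimal sketch, not part of the original argument: it integrates x″ + g(x) = 0 with a classical Runge-Kutta scheme and measures the time between two consecutive zeros of x′. The function names `semiperiod` and `g_example`, the step size, and the sample nonlinearity g(x) = x + x² (which has g(0) = 0 and g′(0) = 1) are all illustrative assumptions.

```python
import math


def semiperiod(g, v0, dt=1e-3, t_max=100.0):
    """Estimate the semiperiod of the solution of x'' + g(x) = 0 with
    x(0) = 0, x'(0) = v0, as the time between two consecutive zeros of x'."""

    def rk4_step(x, v):
        # One classical Runge-Kutta step for the system x' = v, v' = -g(x).
        k1x, k1v = v, -g(x)
        k2x, k2v = v + 0.5 * dt * k1v, -g(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, -g(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, -g(x + dt * k3x)
        return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
                v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)

    x, v, t = 0.0, v0, 0.0
    turning_times = []
    while t < t_max and len(turning_times) < 2:
        x_new, v_new = rk4_step(x, v)
        if v * v_new < 0:
            # x' changed sign inside this step: interpolate linearly.
            turning_times.append(t + dt * v / (v - v_new))
        x, v, t = x_new, v_new, t + dt
    if len(turning_times) < 2:
        raise ValueError("no two turning points found; is the solution bounded?")
    return turning_times[1] - turning_times[0]


def g_example(x):
    # Hypothetical nonlinear example with g(0) = 0 and g'(0) = 1.
    return x + x * x


if __name__ == "__main__":
    for v0 in (0.3, 0.1, 0.01):
        print(v0, semiperiod(g_example, v0), "limit:", math.pi)
```

As the initial speed shrinks, the printed semiperiods should drift toward π, which is the value π/√g′(0) predicted by the lemma for this choice of g.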


As a very simple example of these results, suppose that we have the constant
force f(r) = 1. Then (*) gives $\alpha = 1/\sqrt{3}$, so the revolving orbit must
have an apsidal angle of $\pi/\sqrt{3}$. Consequently, a nearly circular orbit (a)
under the constant force f(r) = 1 must have an apsidal angle close to this value.
It appears from some graphing experiments that the apsidal angle is fairly close
even for orbits (b) not so close to circular.

[Figure: (a) a nearly circular orbit and (b) a less circular orbit under the
constant force f(r) = 1, with pericenter and apocenter marked; the apsidal
angle is π/√3 radians ≈ 103.9°.]
More generally, for any power force f(r) = rⁿ the formula for α is independent
of ρ,
$$\alpha = \frac{1}{\sqrt{3+n}}.$$
This answer makes no sense for n ≤ −3, and, correspondingly, the value g′(0)
in (P₀),
$$g'(0) = (3+n)\,\rho^{\,n-1},$$
is no longer positive. This is not surprising in light of our analysis of inverse cube
forces: We found that all orbits were either circles or curves that spiraled into
the origin or escaped to infinity; in other words, the only orbits close to circles
are circles, or to put it another way, circular orbits are not stable for f(r) = r⁻³,
and this is in fact true for f(r) = rⁿ whenever n < −3 (Problem 13).
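The prediction α = 1/√(3+n) can be compared with a direct simulation. The sketch below is an illustration under stated assumptions, not the author's method: it takes unit mass, the radial equation r″ = h²/r³ − f(r) with θ′ = h/r², angular momentum h = 1 (so that the circular orbit has radius 1 for every power n), and a starting apsis at the hypothetical radius r₀ = 1.01. The name `apsidal_angle` is invented for this example.

```python
import math


def apsidal_angle(n, r0=1.01, h=1.0, dt=1e-3, max_steps=1_000_000):
    """Angle (radians) swept between two consecutive apsides of an orbit
    under the attractive force f(r) = r**n, starting at an apsis r = r0."""

    def rhs(r, v):
        # Time derivatives of (r, r', theta) for unit mass.
        return v, h * h / r**3 - r**n, h / (r * r)

    r, v, theta = r0, 0.0, 0.0
    for _ in range(max_steps):
        k1 = rhs(r, v)
        k2 = rhs(r + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1])
        k3 = rhs(r + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1])
        k4 = rhs(r + dt * k3[0], v + dt * k3[1])
        r_new = r + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        v_new = v + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        theta_new = theta + dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6
        if v * v_new < 0:
            # r' changed sign: the next apsis lies inside this step.
            frac = v / (v - v_new)
            return theta + frac * (theta_new - theta)
        r, v, theta = r_new, v_new, theta_new
    raise RuntimeError("no second apsis found")


if __name__ == "__main__":
    for n in (0, 1, -2):
        simulated = math.degrees(apsidal_angle(n))
        predicted = 180.0 / math.sqrt(3 + n)
        print(n, simulated, predicted)
```

For the constant force (n = 0) the simulated value should land near 103.9°, and for n = 1 and n = −2 near 90° and 180° respectively.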

By contrast, consider a force close to an inverse square force, $f(r) = r^{-(2+\varepsilon)}$
for small ε. Then the apsidal angle of a nearly circular orbit, measured in
degrees, must be close to
$$\frac{180}{\sqrt{1-\varepsilon}} = 180(1-\varepsilon)^{-1/2} = 180\left(1 + \tfrac12\varepsilon + \cdots\right) \approx 180 + 90\varepsilon.$$
So if ε = 10⁻³, the apocenter would advance by about .09°, which is more
than 5′ [where ′ denotes a minute, or 1/60 of a degree], easily measurable by
astronomers. At the beginning of Book 3 of the Principia, Newton refers to
these considerations to point out that observations show that the inverse square
law must be true “with the greatest exactness. ... For the slightest departure
from the ratio of the square would ... necessarily result in a noticeable motion
of the apsides in a single revolution and an immense such motion in many
revolutions.”
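The arithmetic behind the figure of about .09° can be reproduced in a few lines. This is just a sketch of the formula 180/√(1−ε) quoted above; the function name `apocenter_advance_deg` is an assumption made for the example.

```python
import math


def apocenter_advance_deg(eps):
    """Advance of the apocenter, in degrees per half revolution, for a nearly
    circular orbit under f(r) = r**-(2 + eps), using 180/sqrt(1 - eps)."""
    if eps >= 1:
        raise ValueError("the formula requires eps < 1")
    return 180.0 / math.sqrt(1.0 - eps) - 180.0


if __name__ == "__main__":
    advance = apocenter_advance_deg(1e-3)
    print("degrees:", advance)
    print("arc minutes:", 60.0 * advance)
```

For ε = 10⁻³ the exact expression and the linear approximation 90ε agree to within a fraction of an arc second.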




The two-body problem. Newton easily disposed of the two-body problem, involving
two particles c₁ and c₂ with masses m₁ and m₂, each acting on the other
by a radially symmetric central force:

Two bodies that attract each other describe similar figures about their common
center of gravity and also about each other.

It might be entertaining to read Newton’s explanation, Newton [2; pg. 561],
but we will simply resort to a few formulas. Instead of writing the magnitude
of the force that c₁ exerts on c₂ as −m₂f, let us now simply write the equations
for this force, and the opposite force that c₂ exerts on c₁, in terms of the unit
vector u = (c₁ − c₂)/|c₁ − c₂| as
$$m_1 c_1'' = -f(|c_1 - c_2|)\cdot u$$
$$m_2 c_2'' = -f(|c_1 - c_2|)\cdot(-u) = +f(|c_1 - c_2|)\cdot u,$$
where f is assumed symmetric in m₁ and m₂ (normally simply involving the
factor m₁m₂). We immediately obtain
$$(c_1 - c_2)'' = \left(\frac{1}{m_1} + \frac{1}{m_2}\right)\cdot\bigl(-f(|c_1 - c_2|)\bigr)\cdot u = -\frac{m_1 + m_2}{m_1 m_2}\, f(|c_1 - c_2|)\cdot u,$$
which shows that c₁ − c₂ (the motion of c₁ about c₂ in Newton’s statement)
is given by an equation of the exact same form, for a particle with the reduced
mass μ = m₁m₂/(m₁ + m₂); see Problem 10 for a specific example of how this
works out.

Moreover, taking the center of mass C = (m₁c₁ + m₂c₂)/(m₁ + m₂) as the
origin of an inertial system, we have
$$c_1 - C = \frac{m_2}{m_1 + m_2}\,(c_1 - c_2),$$
$$c_2 - C = -\frac{m_1}{m_1 + m_2}\,(c_1 - c_2),$$
so γᵢ = cᵢ − C satisfy
$$\frac{m_1 + m_2}{m_2}\,\gamma_1 = -\frac{m_1 + m_2}{m_1}\,\gamma_2 = (c_1 - c_2),$$


and thus the path of each particle with respect to the center of mass is a similar 
orbit. Note that the two paths will generally lie in different planes through C. 

In the case of the moon and the earth, the center of mass lies a little below 
the earth’s surface on the side facing the moon, and the earth is rotating about 
this point as the moon rotates about this same center of mass; for the period of 
the earth’s rotation about the center of mass see Problem 10 (d). 
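The claim about the earth and the moon follows from the formulas for c₁ − C and c₂ − C. The sketch below evaluates them with commonly quoted approximate values for the masses, the mean earth-moon distance, and the earth's mean radius; these numbers and the function names are assumptions supplied for illustration and do not come from the text.

```python
# Approximate, commonly quoted values (assumptions for this illustration).
M_EARTH = 5.972e24             # kg
M_MOON = 7.342e22              # kg
EARTH_MOON_DISTANCE = 3.844e8  # m, mean center-to-center distance
EARTH_RADIUS = 6.371e6         # m, mean radius


def barycenter_offsets(m1, m2, separation):
    """Distances of particles 1 and 2 from their center of mass,
    |c1 - C| = m2/(m1+m2) * |c1 - c2| and |c2 - C| = m1/(m1+m2) * |c1 - c2|."""
    total = m1 + m2
    return m2 / total * separation, m1 / total * separation


def reduced_mass(m1, m2):
    """Reduced mass m1*m2/(m1 + m2) of the two-body system."""
    return m1 * m2 / (m1 + m2)


if __name__ == "__main__":
    d_earth, d_moon = barycenter_offsets(M_EARTH, M_MOON, EARTH_MOON_DISTANCE)
    print("center of mass from earth's center (km):", d_earth / 1e3)
    print("depth below the surface (km):", (EARTH_RADIUS - d_earth) / 1e3)
    print("reduced mass / moon mass:", reduced_mass(M_EARTH, M_MOON) / M_MOON)
```

With these inputs the center of mass comes out inside the earth, and the reduced mass is only slightly smaller than the mass of the moon.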




“The attractive forces of spherical bodies”. This is the title of the section of the
Principia that follows this initial analysis of the two-body problem. Although 
that analysis addresses one problem with the purely theoretical analysis for the 
one-body problem, it still requires that we consider only particles, bodies which 
can essentially be regarded as point masses. But the attraction of an object 
toward the earth, for example, is the result of its attraction toward myriad par- 
ticles within the earth, not only at varying distances, but of varying density. In 
fact, the section of the Principia that treats the two-body problem ends with 
the words “Let us see, therefore, what the forces are by which spherical bodies, 
consisting of particles that attract in the way already set forth, must act upon 
one another, and what sorts of motions result from such forces.” 

In essence, Newton showed that the inverse square force attraction for par- 
ticles holds just as well for spherical bodies. Moreover, these bodies need not 
have uniform density; it is only necessary that their densities are spherically
symmetrical around their centers, which is a good rough approximation even
for complicated bodies like the earth. These results were apparently a pleasant
surprise for Newton, who at first suspected that for spherical bodies there would 
only be a close approximation to an inverse square force at large separations. 

Newton begins by considering a (2-dimensional) sphere whose mass m is uni- 
formly distributed over its surface, and a particle P that is attracted toward the 
various points of the sphere by a force inversely proportional to the square of
the distance from P to that point. Newton first proves that the total force on P
is 0 when P is inside the sphere, and his geometric proof is both so simple and
so alluring that it was once the proof of choice.

Recall that for the two intersecting segments of a circle shown below, triangles
APC and BPD are similar, for ∠A = ∠D since they subtend the same arc CB.

[Figure: two segments AB and CD of a circle intersecting at P.]

Now consider a point P inside a sphere, and draw AB and CD through P
intersecting this sphere in very small arcs AC and BD. Since triangles PBD and
PAC are similar, we have
$$\frac{BD}{PB} = \frac{AC}{PA}, \qquad \frac{BD^2}{PB^2} = \frac{AC^2}{PA^2}.$$

[Figure: lines AB and CD through a point P inside the sphere, cutting off the
small arcs AC and BD.]

When we rotate our lines around the angle bisector of ∠P, we get a corresponding
3-dimensional picture in which BD² is close to (a constant multiple of) the
area of the portion of the sphere that is cut off on one side, while 1/PB² is the
factor by which points in this portion attract P; and AC²/PA² similarly
represents the force by which P is attracted in the opposite direction. So these
forces cancel out, and, in Newton’s words, “by a similar argument, all the
attractions throughout the whole spherical surface are annulled by opposite
attractions.”

Newton next considers the case where P lies outside the sphere, as in (a) of
the figure below, and shows that it acts as if it were attracted by a particle at
the center of the sphere with the total mass m of the sphere. Newton analyses
this case with an even more ingenious geometric argument that textbooks
never use (see Chandrasekhar [2; pp. 270-273] for Newton’s proof and further
discussion), usually resorting instead to a fairly straightforward integration,
as in Problem 16.

[Figure: (a) a particle P outside a spherical shell; (b) a particle outside a
3-ball.]

From there it is straightforward to extend the result to the
case (b) of the gravitational force exerted by any 3-ball whose density varies
only with the distance from the center. But a more elegant treatment is available
using the Divergence Theorem, a.k.a. Gauss’ Theorem, Ostrogradsky’s Theorem,
Green’s Theorem, which is more frequently mentioned in connection with
electric fields.
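Both of Newton's shell results can be checked with the kind of straightforward integration mentioned above. The following is a rough numerical sketch, not the computation of Problem 16 itself: it slices a uniform shell into rings by polar angle and sums the axial components of the inverse square attraction with the midpoint rule. The function name `shell_attraction`, the default units G = M = R = 1, and the number of slices are assumptions.

```python
import math


def shell_attraction(d, R=1.0, M=1.0, G=1.0, steps=20000):
    """Attraction (per unit mass, directed toward the center) exerted on a
    point at distance d from the center by a uniform spherical shell of
    radius R and mass M, computed by slicing the shell into rings."""
    if d == R:
        raise ValueError("the point must not lie on the shell")
    dphi = math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * dphi                        # midpoint of the slice
        ring_mass = 0.5 * M * math.sin(phi) * dphi    # mass of the ring
        s2 = R * R + d * d - 2.0 * R * d * math.cos(phi)  # squared distance
        # Component of the attraction along the axis, toward the center.
        total += G * ring_mass * (d - R * math.cos(phi)) / s2**1.5
    return total


if __name__ == "__main__":
    print("inside  (d = 0.5):", shell_attraction(0.5))
    print("outside (d = 2.0):", shell_attraction(2.0), "vs", 1.0 / 2.0**2)
```

Inside the shell the sum should come out at essentially zero, and outside it should match GM/d², the attraction of a single particle of mass M at the center.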

Recall that for the vector field $X = \sum_{i=1}^{n} a^i\,\partial/\partial x^i$ (where
(x¹, ..., xⁿ) is the standard coordinate system on ℝⁿ), the divergence of X is
defined by
$$\operatorname{div} X = \sum_{i=1}^{n} \frac{\partial a^i}{\partial x^i},$$
and for a compact n-dimensional manifold-with-boundary B ⊂ ℝⁿ, with outward
pointing unit normal vector ν on ∂B, we have the Divergence Theorem¹
$$\int_B \operatorname{div} X\,dV_n = \int_{\partial B} \langle X, \nu\rangle\,dV_{n-1},$$
where dVₙ is the n-dimensional volume element on B, and dVₙ₋₁ is the
(n−1)-dimensional volume element on ∂B.

We compute that div X = 0 for the vector field $X(p) = p/|p|^n$ in ℝⁿ, and
in particular, div X = 0 for the vector field X in ℝ³
$$X(p) = \frac{p}{|p|^3},$$
which is just a radial vector field whose length is proportional to 1/|p|².
So if B ⊂ ℝ³ contains a sphere S around the origin, applying the divergence
theorem to B − (interior of S), and noting that on S the “outward pointing
normal” ν is actually inward pointing, we obtain
$$\int_{\partial B} \langle X, \nu\rangle\,dA = -\int_{S} \langle X, \nu\rangle\,dA = 4\pi,$$
where the same dA is being used to denote the 2-dimensional volume element
on both ∂B and the sphere S.
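For readers who want the two computations spelled out, here is a short worked version. It is a sketch under the usual conventions, with p = (x¹, ..., xⁿ) away from the origin and S taken, as an assumption for the second line, to be the sphere of radius R with unit normal pointing away from the origin.

```latex
% Divergence of X(p) = p/|p|^n on R^n minus the origin:
\frac{\partial}{\partial x^i}\!\left(\frac{x^i}{|p|^n}\right)
   = \frac{1}{|p|^n} - n\,\frac{(x^i)^2}{|p|^{n+2}}
\quad\Longrightarrow\quad
\operatorname{div} X
   = \frac{n}{|p|^n} - n\,\frac{|p|^2}{|p|^{n+2}} = 0 .

% Flux of X(p) = p/|p|^3 through the sphere S of radius R in R^3,
% with unit normal p/|p| pointing away from the origin:
\left\langle X, \frac{p}{|p|} \right\rangle = \frac{1}{R^2}
\quad\Longrightarrow\quad
\int_S \left\langle X, \frac{p}{|p|} \right\rangle dA
   = \frac{1}{R^2}\cdot 4\pi R^2 = 4\pi .
```

The value 4π is independent of R, which is what makes the flux through any ∂B enclosing the origin equal to 4π.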
Applying this to the vector field
$$X(p) = -Gm\,\frac{p}{|p|^3},$$
giving the gravitational force exerted by a particle of mass m at the origin, we
see that the flux $\int_{\partial B} \langle X, \nu\rangle\,dA$ of X through ∂B is −4πGm. In physics texts
this special case is often called Gauss’ Law, and an elementary proof is often
provided; see Problem 19. The same result holds for the field produced by any
collection of points surrounded by ∂B, and even for a 3-dimensional collection
of points surrounded by ∂B, where we specify a density rather than individual
masses: the flux through ∂B is always −4πGM, where M is the total mass.

¹ See, e.g., DG, Vol. 1.


Now consider a 3-ball whose density varies only with the distance from the
center, and a sphere S = ∂B of radius R, with the same center, surrounding it.

By symmetry, the gravitational field X produced by the ball must always point
toward the origin and have the same magnitude μ at all points of S. So
$$-4\pi GM = \int_S -\mu\,dA = -4\pi R^2\mu,$$
and thus the gravitational force of the ball on a particle of mass m at distance R
from the center of the ball must have magnitude
$$\frac{GmM}{R^2}.$$

Finally, we leave it to the reader to conclude first, that for a particle of mass m,
and a radially symmetric 3-ball of mass M, the total force of the particle on the
ball (the sum of the forces of the particle on each of the particles of the ball)
also has magnitude GmM/R², where R is the distance from the particle to the
center of the ball; and second, that for any two such balls, of masses M₁ and M₂,
the total force of each on the other has magnitude GM₁M₂/R², where R is the
distance between their centers.




ADDENDUM 4A 
A LA PRINCIPIA 


Although Newton never stated “conservation of energy” or even gave a name
to the quantity we call kinetic energy, he essentially recognized it in a pair of
Propositions. The second of these corresponds to the calculation on page 88:


Proposition 40. If a body, under the action of any centripetal force, moves in
any way whatever and another body ascends straight up or descends straight
down, and if their velocities are equal in some one instance in which their
distances from the center are equal, their velocities will be equal at all equal
distances from the center.


As usual, the proof (Newton [2; pg. 528]), with a geometric diagram, contains
no equations and almost no symbols, but one can see that it is equivalent to
the few lines on page 88 (in comparing the force DE on the body descending
straight down with the force IN on the body moving on the path VITKk,
Newton decomposes IN as IT plus TN, where TN is perpendicular to the path,
and notes that it doesn’t affect the motion, corresponding to the next-to-last
step of the equation on page 88).

[Figure: Newton’s diagram for Proposition 40, with labeled points A, V, and C.]


Newton also points out the result that we have discussed on pages 92—93: 


Corollary 1. Hence if a body either oscillates while hanging by
a thread or is compelled by any very smooth and perfectly slippery
impediment to move in a curved line, and another body ascends
straight up or descends straight down, and their velocities are equal
at any identical height, their velocities at any other equal heights will
be equal. For the thread of the pendent body or the impediment of
an absolutely slippery vessel produces the same effect as the transverse
force NT. The body is neither retarded nor accelerated by these, but
only compelled to depart from a rectilinear course.


Moreover, Newton had essentially computed the potential energy function V 
for an arbitrary radially symmetric central force in his previous 


Proposition 39. Suppose a centripetal force of any kind, and grant the
quadratures of curvilinear figures; it is required to find, for a body ascending
straight up or descending straight down, the velocity in any of its positions

Here “grant the quadratures of curvilinear figures” means that the answer is 
allowed to be expressed in terms of integrals, and Newton’s answer amounts to 
the term —(F or) on page 89. 

Newton doesn’t actually write down an integral, of course. His proof involves
another complicated geometric diagram, of which we reproduce only a part,
showing two positions D and E of the falling body, with the length of DF being
f(D), and similarly for EG, and the curved line being the locus of all such
points.

[Figure: part of Newton’s diagram for Proposition 39, showing the positions D
and E of the falling body with the segments DF and EG.]

In other words, this is basically just the graph of f turned on its side,
and Newton gives his answer in terms of the area under (i.e., to the left of) this
graph. Details may be found in Chandrasekhar [2; pp. 161-163].

And finally we have

Proposition 41. Supposing a centripetal force of any kind and granting the
quadratures of curvilinear figures, it is required to find the trajectories in which
bodies will move and also the times of their motions in the trajectories so found.


As discussed in Cohen and Whitman [1; pp. 141-142], Newton’s contemporaries
generally failed to appreciate the significance of this Proposition, hardly
surprising, since the proof comes with a terrifying diagram (basically a
combination of the diagrams for Proposition 39 and Proposition 40) and the
demonstration is given totally in terms of complicated geometric constructions.
These correspond, step-by-step, to a modern proof using integrals; a detailed
account is given in Chandrasekhar [2; pp. 168-171] or Cohen and Whitman
[1; pp. 334-345].




ADDENDUM 4B 


REDUCTION TO A 
ONE-DIMENSIONAL PROBLEM 


Equation (B) on page 121 can be written (taking m = 1 for simplicity)
$$r'' = -\frac{\partial V(r)}{\partial r} + \frac{h^2}{r^3} = -\frac{\partial}{\partial r}\left(V(r) + \frac{h^2}{2r^2}\right)$$
or
$$r'' = -\frac{\partial \bar V}{\partial r}(r)$$
for the “effective potential energy” $\bar V(r) = V(r) + h^2/2r^2$, which is just the
equation for a one-dimensional problem with potential energy V̄. [Moreover,
the energy Ē for this problem is
$$\bar E = \tfrac12 r'^2 + \bar V(r) = \tfrac12 r'^2 + \frac{h^2}{2r^2} + V(r) = \tfrac12\left(r'^2 + r^2\theta'^2\right) + V(r)$$
by equation (2) on page 120, which is the same as the energy E for the original
problem, by equation (1) on that page.]

Many aspects of the general nature of orbits can be interpreted quite simply
in terms of this one dimensional problem, by considering the graph of V̄, as in
the case of an inverse square law, shown below, with the graph of V̄ decreasing
from ∞ to a minimum, and then increasing asymptotically to 0.

[Figures: the graph of V̄ for an inverse square law, with energy levels E ≥ 0
and E < 0 marked.]

Since kinetic energy is non-negative, we must have E ≥ V̄, so if E ≥ 0, then r
must lie in an interval [r₁, ∞) for some r₁, so the particle comes in from infinity
to r = r₁, and then moves back out to infinity (hyperbolic and parabolic orbits).
On the other hand, if E < 0, then V̄ will eventually exceed E, so r will have to be
in some interval [r₁, r₂] (elliptical orbits). In this case the function r oscillates
between the values r₁ and r₂, the sort of motion considered in Chapter 8, see
Problem 8-4. If the value of E is the minimum possible value of V̄, then r can
have only a single value (circular orbit).
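The interval of allowed radii can be computed explicitly in the inverse square case. The sketch below assumes unit mass and the potential V(r) = −K/r with K > 0 (an attractive force K/r²), so that the condition E = V̄(r) becomes the quadratic Er² + Kr − h²/2 = 0; the function name `radial_range` and the sample values are illustrative only.

```python
import math


def radial_range(E, K=1.0, h=1.0):
    """Allowed interval (r1, r2) of radii for unit mass in the effective
    potential  -K/r + h**2/(2 r**2)  at energy E, assuming K > 0.
    r2 is math.inf for unbounded orbits (E >= 0)."""
    disc = K * K + 2.0 * E * h * h
    if disc < 0:
        raise ValueError("E is below the minimum of the effective potential")
    root = math.sqrt(disc)
    if E == 0:
        return h * h / (2.0 * K), math.inf     # parabolic orbit
    if E > 0:
        return (-K + root) / (2.0 * E), math.inf   # hyperbolic orbit
    # E < 0: both roots of E r^2 + K r - h^2/2 = 0 are positive.
    return (K - root) / (-2.0 * E), (K + root) / (-2.0 * E)


if __name__ == "__main__":
    for energy in (0.5, 0.0, -0.375, -0.5):
        print(energy, radial_range(energy))
```

With K = h = 1 the minimum of the effective potential is −1/2, attained at r = 1, so the last call returns the degenerate interval of a circular orbit.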





In the diagram for an inverse fourth law, after V̄ reaches its maximum it
decreases asymptotically to 0. For a given value of E > 0, the particle can never
have r₁ < r < r₂. If it starts with an initial value r₀ < r₁, it will stay in the
region r < r₁ and eventually “fall into the center”, even faster than with an
inverse cube force (compare Problem 6). If it starts with r₀ > r₂, it stays in the
region r > r₂ and eventually goes to infinity.


ADDENDUM 4C 
RUTHERFORD SCATTERING 


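If the conic section for an inverse square force is a hyperbola, then instead of
equation (b₂) on page 125 we have, by Problem 3,
$$(\mathrm{b}_2')\qquad a = \frac{h^2/K}{\varepsilon^2 - 1} \quad\text{and}\quad b = a\sqrt{\varepsilon^2 - 1},$$
and (b₁) on page 124 then gives
$$(\mathrm{b}_3')\qquad a = \frac{K}{2E}.$$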

For reasons that will appear shortly, the distance from a focus of a hyperbola
to an asymptote is called the impact parameter s, and the angle Θ in the figure
below is called the scattering angle; its supplement Φ is the total angle through
which a particle moving on the orbit turns as it comes in from infinity and then
moves out to infinity.

[Figure: a hyperbolic orbit with its asymptotes, showing the impact parameter s
and the angles Θ and Φ.]

Problem 3 shows that cos ½Φ = 1/ε, so
$$b^2 = a^2(\varepsilon^2 - 1) = a^2\left[\sec^2\tfrac12\Phi - 1\right] = a^2\tan^2\tfrac12\Phi = a^2\cot^2\tfrac12\Theta,$$
and thus
$$b = \frac{K}{2E}\cot\tfrac12\Theta.$$
On the other hand, a direct calculation shows that the impact parameter s is
simply b. So
$$s = \frac{K}{2E}\cot\tfrac12\Theta,$$
giving the scattering angle Θ in terms of the impact parameter s and the energy
per unit mass E.
These formulas all hold even if K < 0, the only difference being that, as 
indicated in Problem 3, the solution 
2 
(b) pe me 
1+h?Acosé 
is now always a hyperbola, though it is now the dashed branch in the figure on 
the previous page. 

In Rutherford’s experiments, a uniform beam of α-particles (helium nuclei)
of known energy was directed at the (much heavier) atoms in a stationary piece
of gold foil. If q₂ is the charge on the nucleus of the gold atom, and q₁ the
charge on one of the particles, then the repulsive force between them is q₁q₂/r²
(up to a constant depending on the units of charge and force). In this formula,
the force on an α-particle of mass m isn’t proportional to m, so we should really
write it as
$$f(r) = \frac{q_1 q_2}{r^2} = \frac{m\,(q_1 q_2/m)}{r^2},$$
and the above formula for the impact parameter s becomes
$$(\mathrm{S})\qquad s = \frac{q_1 q_2}{2E}\cot\tfrac12\Theta.$$
IE 


The figure below shows an α-particle approaching the nucleus from a distance,
with impact parameter s. Its path is the hyperbola with one asymptote
being the horizontal dashed line, and the other asymptote determined
by the energy of the α-particle, with Θ being the angle through which the
α-particle is scattered.

[Figure: an α-particle with impact parameter s scattered through the angle Θ.]

Given α-particles of a particular energy E, we can compute Θ(s) for each s, and
thus determine how the number of particles varies as we vary Θ. However, we
need a more realistic picture to correspond to actual experiments, where we want
to measure the density of particles per unit area of a sphere around the nucleus
for the scattered α-particles from a beam initially moving in the direction of the
horizontal line.

[Figure: a beam of α-particles scattered onto a sphere around the nucleus, with
the scattering angle Θ(s₀) marked.]

A stream of α-particles with impact parameters between s₀ and s > s₀ will
have scattering angles between Θ(s) and Θ(s₀), with Θ(s) < Θ(s₀), since a
larger impact parameter implies a smaller scattering angle, as shown by (S). If
there are N particles per unit area, then the number with impact parameter
between s₀ and s is πN(s² − s₀²), the beam density times the area of the annulus
between the radii s₀ and s, which must equal the number with scattering
angles between Θ(s) and Θ(s₀). If σ(θ) is the density of scattered particles at
points of the sphere with a given θ, so that Nσ(θ) is the number per unit area
at these points, then calculating the integral of σ dA over the corresponding
region of the sphere by Fubini’s theorem (compare Problem 16), we have
$$\pi N(s^2 - s_0^2) = \int_{\Theta(s)}^{\Theta(s_0)} N\sigma(\theta)\cdot 2\pi\sin\theta\,d\theta$$
and thus, taking derivatives, we have
$$(1)\qquad s = -\sigma(\Theta)\,\sin\Theta\,\frac{d\Theta}{ds}$$


(as usual with Leibnizian notation, Θ really denotes Θ(s), etc.). Similarly,
writing (S) as
$$(2)\qquad \cot\tfrac12\Theta = \frac{2Es}{q_1 q_2}$$
and differentiating, we get
$$(3)\qquad \frac{d\Theta}{ds} = -\frac{4E}{q_1 q_2}\,\sin^2\tfrac12\Theta.$$
Substituting (S), (2), and (3) back into (1) then gives
$$\frac{q_1 q_2}{2E}\cot\tfrac12\Theta = \sigma(\Theta)\,\sin\Theta\cdot\frac{4E}{q_1 q_2}\,\sin^2\tfrac12\Theta,$$
leading to the famous Rutherford scattering formula
$$\sigma(\Theta) = \frac14\left(\frac{q_1 q_2}{2E}\right)^2 \frac{1}{\sin^4\tfrac12\Theta}.$$
For α-particles of energy E approaching the nucleus of a gold atom head on,
the velocity is 0 at the closest approach, which must therefore be
$$\frac{q_1 q_2}{E},$$
where the potential energy is E. This is pretty clearly the smallest possible value
for this E for any impact parameter s, as shown explicitly by the more involved
formula for arbitrary s given in Problem 24. So if r₀ is the radius of the nucleus,
then the scattering formula should hold as long as
$$\frac{q_1 q_2}{E} > r_0 \quad\text{or}\quad E < \frac{q_1 q_2}{r_0}.$$
By examining the scattering results for high values of E, Rutherford was able
to conclude that the radius of the nucleus must be on the order of 10⁻¹² cm.
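The passage from (S) and (1) to the scattering formula can be sanity-checked numerically. The sketch below is only an illustration in the unit-free form used above: it differentiates s(Θ) by a central difference, forms σ = −s (ds/dΘ)/sin Θ as in (1), and compares the result with the closed formula. All function names and the sample values of q₁q₂ and E are assumptions.

```python
import math


def impact_parameter(theta, q1q2, E):
    """Impact parameter s = (q1 q2 / 2E) cot(theta/2), formula (S)."""
    return q1q2 / (2.0 * E) / math.tan(theta / 2.0)


def sigma_closed_form(theta, q1q2, E):
    """Rutherford formula: (1/4) (q1 q2 / 2E)^2 / sin^4(theta/2)."""
    return 0.25 * (q1q2 / (2.0 * E)) ** 2 / math.sin(theta / 2.0) ** 4


def sigma_from_relation(theta, q1q2, E, d=1e-6):
    """sigma from relation (1), rewritten as -s (ds/dTheta) / sin(Theta),
    with ds/dTheta approximated by a central difference."""
    ds = (impact_parameter(theta + d, q1q2, E)
          - impact_parameter(theta - d, q1q2, E)) / (2.0 * d)
    return -impact_parameter(theta, q1q2, E) * ds / math.sin(theta)


def head_on_closest_approach(q1q2, E):
    """Distance q1 q2 / E at which the potential energy equals E."""
    return q1q2 / E


if __name__ == "__main__":
    q1q2, E = 2.0, 0.7  # arbitrary sample values
    for theta in (0.3, 1.0, 2.5):
        print(theta, sigma_closed_form(theta, q1q2, E),
              sigma_from_relation(theta, q1q2, E))
    print("closest approach:", head_on_closest_approach(q1q2, E))
```

The two columns should agree to many digits, and both grow without bound as Θ approaches 0.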


It is often pointed out that the integral of σ dA over the whole sphere is ∞,
and the scattering formula also gives σ(0) = ∞. To make sense of this, we note
that to get scattering angles Θ arbitrarily close to 0, we must have α-particles
with arbitrarily large impact parameters. Thus we would have to have a beam
of infinite extent; moreover, although the particles with large impact parameters
have small scattering angles, most of them will completely miss any particular
sphere around the nucleus, and would only be detected if we made measurements
infinitely far from the nucleus.




ADDENDUM 4D 
BERTRAND’S THEOREM 


Bertrand’s Theorem is a result of the sort that endlessly fascinates because of
its elegance, simplicity, and uselessness; the paper Bertrand [1] announcing the
theorem concludes: “Our illustrious Corresponding Member Mr. Tchebychef,
to whom I had communicated the preceding proof, sent me the judicious
observation that the theorem, although useless nowadays for the already so perfect
theory of the planets, will have a useful application in extending Newton’s laws
of gravitation to the case of double stars.”

Bertrand’s Theorem states that the only central forces for which all bounded
orbits are closed are multiples of either f(r) = r or f(r) = r⁻²; in both cases
the bounded orbits are ellipses, centered at the origin in the first case, and with
focus at the origin in the second. Many proofs have been given, almost all of
which first show that the force law must be a power law, and then restrict the
possible powers to 1 or −2, and almost all of these proofs proceed by taking only
a convenient number of terms in various Taylor series, and/or Fourier series,
without being overly concerned about the validity of the approximation. In fact,
in physics books the usual analysis of the apsidal angle carries this to the extreme,
simply replacing x″(t) + g(x(t)) = 0 with the equation x″(t) + g′(0)x(t) = 0,
totally dispensing with the Period Lemma.

The following argument uses our results about the apsidal angle to carry 
out the first part of the proof rigorously, and then relies on an argument of 
Arnold [2] for the second part of the proof. We assume as hypothesis that 
our central force has stable circular orbits of any radius, and for simplicity, we 
consider particles of mass m = 1 in all equations. 

For orbits near the circular orbit of radius ρ, the apsidal angle varies
continuously, and approaches πα with
$$\frac{3f(\rho) + \rho f'(\rho)}{f(\rho)} = \frac{1}{\alpha^2}.$$
But these orbits are closed only when the apsidal angle is a rational multiple
of π, so by continuity the apsidal angle must actually be this πα for nearby orbits.
Moreover, it also follows that this apsidal angle must be the same for all ρ. Thus,
for A = 1/α² > 0 we have
$$\frac{\rho f'(\rho)}{f(\rho)} = A - 3$$
for all ρ, or in usual differential equation notation, with y = f(ρ),
$$\frac{dy}{d\rho} = (A - 3)\,\frac{y}{\rho},$$
with solutions y = k·ρ^(A−3) for some constant k, or f(r) = k·r^(A−3), and we
have already reduced the possibilities to multiples of power functions
$$f(r) = r^n \quad\text{with } n > -3.$$


The case n = −1 can be discarded because it has an apsidal angle of π/√2,
which is not commensurable with π. We now show that for n > −1 the only
possibility is n = 1, while for −1 > n > −3 the only possibility is n = −2; to do
this, we need to look at the apsidal angle for orbits that are not close to circular.

If u_min and u_max are consecutive minimum and maximum values of u = 1/r,
we can use equation (D) on page 122 to write the apsidal angle as
$$\int_{u_{\min}}^{u_{\max}} \frac{h\,du}{\sqrt{2\bigl(E - V(1/u)\bigr) - h^2u^2}},$$
the sign of this quantity being irrelevant, since the apsidal angle really refers
to the absolute value of the angle between the pericenter and apocenter. The
substitution v = u/u_max then changes this to
$$(*)\qquad \int_{u_{\min}/u_{\max}}^{1} \frac{h\,dv}{\sqrt{\dfrac{2}{u_{\max}^2}\bigl(E - V(1/vu_{\max})\bigr) - h^2v^2}}.$$
In addition, since du/dθ = 0 at u_max, equation (C) on page 122 gives
$$(**)\qquad 2E = h^2u_{\max}^2 + 2V(1/u_{\max}).$$


Suppose first that n > −1, so n + 1 > 0. Since V(r) is a positive constant
times r^(n+1), we have V(r) → ∞ as r → ∞, so by conservation of energy all
orbits are bounded, and thus closed. We now consider orbits with E → ∞,
but with bounded h, obtained by starting at an initial point, say with r = 1,
and choosing large initial velocities pointing almost completely outward. Then
1/u_max = r_min ≤ 1, so equation (**) implies that we must have u_max → ∞,
and thus also u_min → 0.

We can also use (**) to write the integral (*) as
$$\int_{u_{\min}/u_{\max}}^{1} \frac{h\,dv}{\sqrt{h^2(1 - v^2) + \dfrac{2}{u_{\max}^2}\bigl[V(1/u_{\max}) - V(1/vu_{\max})\bigr]}},$$
so by choosing large E we can get the integral arbitrarily close to
$$\int_0^1 \frac{dv}{\sqrt{1 - v^2}} = \frac{\pi}{2}.$$
So the apsidal angle must be π/2. But it is also π/√(3+n), so n = 1.

For −1 > n > −3, with V(r) = r^(n+1)/(n + 1), we look at orbits with negative
energy E approaching 0, like the situation shown on page 124 for n = −2; since
−1 > n, as long as we keep E negative our orbit will be bounded, and thus
closed. For E close to 0 the quantity inside the square root sign of (*) is close to
$$-h^2v^2 - \frac{2}{u_{\max}^2}\,V(1/vu_{\max})
  = -h^2v^2 - \frac{2}{u_{\max}^2}\cdot\frac{1}{n+1}\left(\frac{1}{vu_{\max}}\right)^{n+1}
  = -h^2v^2 - \left(\frac{1}{v}\right)^{n+1}\left[\frac{2}{n+1}\left(\frac{1}{u_{\max}}\right)^{n+3}\right],$$
and (**) also shows that the quantity in brackets is close to −h², so the whole
integral is close to
$$\int_0^1 \frac{dv}{\sqrt{v^{-(n+1)} - v^2}}.$$
This has the value π/(3+n) (Problem 21), so the apsidal angle must be π/(3+n).
But it is also π/√(3+n), so n = −2.
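The value quoted from Problem 21 can be confirmed by a change of variables. The following is one possible route, offered as a sketch rather than as the intended solution; it uses only n > −3, and the auxiliary variable w is introduced here for the purpose.

```latex
% Factor v^{-(n+1)} out of the radicand, then substitute
% w = v^{(n+3)/2}, so that dw = \tfrac{n+3}{2}\, v^{(n+1)/2}\, dv.
\int_0^1 \frac{dv}{\sqrt{v^{-(n+1)} - v^2}}
   = \int_0^1 \frac{v^{(n+1)/2}\,dv}{\sqrt{1 - v^{\,n+3}}}
   = \frac{2}{n+3}\int_0^1 \frac{dw}{\sqrt{1 - w^2}}
   = \frac{2}{n+3}\cdot\frac{\pi}{2}
   = \frac{\pi}{n+3}.
```

Setting π/(3+n) equal to π/√(3+n) forces 3 + n = 1, which is the conclusion n = −2 reached above.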




ADDENDUM 4E 
POWER FORCE LAWS AND DUALITY 


As we noted in Problem 2-4, Newton was able to relate elliptical orbits under
a central force f(r) = r⁻², with the center being at a focal point of the ellipse,
to elliptical orbits under the force f(r) = r, with the center of the force now
being at the center of the ellipse, a sort of “duality” between the two forces.
Moreover, Newton also showed (cf. page 70) that the force f(r) = r⁻⁵ is
“self-dual”: the orbits for this one force with two different points as center can be
the same; in fact, a circle is the orbit for this force with any point on the circle
as center.

It turns out that these two results are part of a more general result, which can 
be formulated by considering the orbits as curves in the complex plane. We 
will follow the treatment in Arnold [3; pp. 95-100], which is an exposition and 
extension of a paper by Bohlin [1]. 

We begin by considering the map z ↦ z + 1/z of the complex plane to itself.
For a point z on a circle of radius r, we have
$$z = r\cos\theta + i\,r\sin\theta$$
$$z^{-1} = r^{-1}\cos\theta - i\,r^{-1}\sin\theta,$$
so the points
$$z + \frac{1}{z} = (r + r^{-1})\cos\theta + i\,(r - r^{-1})\sin\theta$$
are on an ellipse with semiaxes a = r + r⁻¹ and b = r − r⁻¹. The foci are the
points (±c, 0) for c² = a² − b² = 4, in other words the complex numbers ±2.
We can clearly get ellipses of any shape by this construction.

[Figure: a circle in the z-plane, its image under w = z + 1/z, and the square of
that image.]

As indicated in the figure, squaring such an ellipse takes it into a set of exactly
the same type, except moved over by 2, so that it has a focus at 0. This argument
thus proves: Every ellipse with focus at the origin is the square of an ellipse
centered at the origin.
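The two geometric claims are easy to test with complex arithmetic. The sketch below samples a circle of radius r, applies z ↦ z + 1/z, and checks the focal-distance property of an ellipse before and after squaring; the helper names `circle_points` and `joukowski` and the sample radius are assumptions made for this illustration.

```python
import cmath
import math


def circle_points(r, count=16):
    """Equally spaced points on the circle |z| = r."""
    return [cmath.rect(r, 2.0 * math.pi * k / count) for k in range(count)]


def joukowski(z):
    """The map z -> z + 1/z used in the text."""
    return z + 1.0 / z


if __name__ == "__main__":
    r = 1.5
    for z in circle_points(r):
        w = joukowski(z)
        W = w * w
        # Ellipse centered at 0 with foci -2 and +2: sum of focal distances
        # should equal twice the semiaxis r + 1/r.
        centered = abs(w - 2) + abs(w + 2)
        # Squared curve: foci at 0 and 4, semiaxis r^2 + 1/r^2.
        focused = abs(W) + abs(W - 4)
        print(round(centered, 12), round(focused, 12))
    print("expected:", 2 * (r + 1 / r), 2 * (r * r + 1 / (r * r)))
```

Every row should show the same pair of numbers, confirming that the image is an ellipse centered at the origin and that its square is an ellipse with a focus at the origin.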


We can then provide an analytic version of Newton’s analysis as follows. Consider
a curve t ↦ w(t) in the complex plane that moves along an ellipse centered
at the origin, under the force f(r) = r, so that
$$(\mathrm{a})\qquad \frac{d^2w}{dt^2} = -w,$$
with
$$|w|^2\,\frac{d\theta}{dt} = h$$
for a constant h, where θ is the argument for w.

We consider the new curve W(τ) = w(t)², and want to find a reparameterization
τ so that
$$|W|^2\,\frac{d\Theta}{d\tau}$$
is constant, with Θ = 2θ being the argument of W(τ). So we want the quantity
$$|W|^2\,\frac{d\Theta}{d\tau} = 2|w|^4\cdot\frac{d\theta}{dt}\bigg/\frac{d\tau}{dt} = 2|w|^2\cdot h\bigg/\frac{d\tau}{dt}$$
to be constant; we can simply take dτ/dt = |w|², so that
$$\frac{d}{d\tau} = \frac{1}{|w|^2}\,\frac{d}{dt}.$$
Then we have
$$\begin{aligned}
\frac{d^2W}{d\tau^2} &= \frac{1}{|w|^2}\,\frac{d}{dt}\left(\frac{1}{|w|^2}\,\frac{d(w^2)}{dt}\right)
 = \frac{2}{|w|^2}\,\frac{d}{dt}\left(\frac{1}{\bar w}\,\frac{dw}{dt}\right) \\
&= \frac{2}{|w|^2}\left(\frac{1}{\bar w}\,\frac{d^2w}{dt^2} - \frac{1}{\bar w^2}\,\frac{d\bar w}{dt}\,\frac{dw}{dt}\right) \\
&= -\frac{2}{|w|^2}\left(\frac{w}{\bar w} + \frac{1}{\bar w^2}\,\frac{d\bar w}{dt}\,\frac{dw}{dt}\right) \qquad\text{using (a)} \\
&= -\frac{2}{w\,\bar w^3}\left(|w|^2 + \left|\frac{dw}{dt}\right|^2\right).
\end{aligned}$$
But the term in parentheses is the constant 2E, by conservation of energy (for
the original force f(r) = r), so we have
$$\frac{d^2W}{d\tau^2} = -\frac{4E}{w\,\bar w^3} = -\frac{(4E)\,W}{|W|^3},$$
so that W is an orbit for an inverse square force.


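More generally, it is possible for the orbit under a force proportional to r^a to
become an orbit for a force proportional to r^A under a map w ↦ w^β: This will
happen whenever
$$(*)\qquad (a+3)(A+3) = 4, \qquad \beta = \frac{a+3}{2}.$$
For the calculations, similar to ones on the previous page, we are considering
a curve w satisfying
$$(\mathrm{b})\qquad \frac{d^2w}{dt^2} = -w\,|w|^{a-1},$$
with |w|²·dθ/dt constant, as before, and the conservation of energy equation
$$2E = \frac{2}{a+1}\,|w|^{a+1} + \left|\frac{dw}{dt}\right|^2.$$
We consider the curve W(τ) = w(t)^β, having corresponding argument Θ = βθ.
To make |W|²·dΘ/dτ constant, we take dτ/dt = |w|^(2(β−1)) = |w|^(a+1), or
$$\frac{d}{d\tau} = \frac{1}{|w|^{a+1}}\,\frac{d}{dt}.$$
Then we have
$$\begin{aligned}
\frac{d^2W}{d\tau^2} &= \frac{1}{|w|^{a+1}}\,\frac{d}{dt}\left(\frac{1}{|w|^{a+1}}\,\frac{d(w^\beta)}{dt}\right)
 = \frac{\beta}{|w|^{a+1}}\,\frac{d}{dt}\left(\frac{w^{\beta-1}}{|w|^{a+1}}\,\frac{dw}{dt}\right) \\
&= \frac{\beta}{|w|^{a+1}}\,\frac{d}{dt}\left(\frac{1}{\bar w^{(a+1)/2}}\,\frac{dw}{dt}\right) \\
&= \frac{\beta}{|w|^{a+1}}\left(\frac{1}{\bar w^{(a+1)/2}}\,\frac{d^2w}{dt^2} - \frac{a+1}{2}\cdot\frac{1}{\bar w^{(a+3)/2}}\,\frac{d\bar w}{dt}\,\frac{dw}{dt}\right),
\end{aligned}$$
which, finally,
$$= -\frac{\beta(\beta-1)\,w^{\beta}}{|w|^{4\beta-2}}\left(\frac{2}{a+1}\,|w|^{a+1} + \left|\frac{dw}{dt}\right|^2\right)$$
after taking into account both (b) and (*), which gives (a + 1)/2 = β − 1. Using
the conservation of energy equation, together with (*) again, we finally obtain
the desired result
$$\frac{d^2W}{d\tau^2} = -2E\beta(\beta-1)\,W\,|W|^{A-1}.$$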


Our initial result relating orbits under an inverse square force to those under
a direct first order force corresponds to a = 1, A = −2, while Newton’s result
about inverse fifth powers corresponds to a = A = −5 (see also Problem 9).
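The bookkeeping in the duality relation (a + 3)(A + 3) = 4, β = (a + 3)/2 can be mechanized with exact rational arithmetic. This is only a small sketch; the function name `dual_exponent` is an assumption, and the exponent a = −3 is excluded because the relation has no solution there.

```python
from fractions import Fraction


def dual_exponent(a):
    """Given the exponent a of a power force r**a, return (A, beta) with
    (a + 3)(A + 3) = 4 and beta = (a + 3)/2, using exact fractions."""
    a = Fraction(a)
    if a == -3:
        raise ValueError("a = -3 has no dual exponent")
    A = Fraction(4) / (a + 3) - 3
    beta = (a + 3) / 2
    return A, beta


if __name__ == "__main__":
    for a in (1, -2, -5, 0):
        A, beta = dual_exponent(a)
        print(f"a = {a}:  A = {A},  beta = {beta}")
```

For a = 1 this returns A = −2 with β = 2 (the squaring map), and for a = −5 it returns A = −5, the self-dual case.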

It should also be noted that when our calculation involves orbits with energy
$E = 0$, we obtain $d^2W/d\tau^2 = 0$, so the curves $W$ are simply straight lines. Or, working
backwards, orbits of energy $E = 0$ are the images of straight lines under the
map $w \mapsto w^{1/\beta}$, where
$$\frac{1}{\beta} = \frac{2}{a+3}.$$
In particular, for an inverse square force, $a = -2$, the parabolas are the images
of straight lines under the map $z \mapsto z^2$.
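The last statement is easy to test on one example. The sketch below assumes the particular line $\operatorname{Re} z = 1$ (a hypothetical choice) and checks that each image point $z^2$ is equidistant from the origin and from the line $x = 2$, which is the focus-directrix description of a parabola with focus at the origin.

```python
def image_of_line_point(s):
    """Image under z -> z^2 of the point 1 + i s on the line Re z = 1."""
    z = complex(1.0, s)
    return z * z

def focus_directrix_gap(s):
    """Distance from the image point to the origin minus its distance to the line x = 2."""
    W = image_of_line_point(s)
    return abs(W) - (2.0 - W.real)

gaps = [focus_directrix_gap(0.5 * k) for k in range(-10, 11)]
print(max(abs(g) for g in gaps))  # should be zero up to rounding
```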




PROBLEMS 
1. (a) If the acceleration $a = c''$ of a curve
$$c(t) = r(t)\,(\cos\theta(t), \sin\theta(t))$$
is decomposed as $a_r + a_\theta$, where $a_r$ points towards the origin and $a_\theta$ is
perpendicular to it, show that
$$a_r = r'' - r\theta'^2.$$

(b) Let $-P$ be the magnitude of a force that is not necessarily radially symmetric
but is a central force (with $P > 0$ for attractive forces), so that we still have
$r^2\theta' = h$ for a constant $h$. Show that the equation of motion can be written in
terms of $u = 1/r$ as
$$\frac{d^2u}{d\theta^2} + u = \frac{P(1/u)}{mh^2u^2},$$
generalizing equation (F).


2. (a) The equation
$$\frac{1}{r^2} = B + A\cos 2\theta$$
is an ellipse centered at the origin when $A < B$ and a hyperbola when $A > B$.
(b) For the attractive force f(r) = mKr all orbits have the same period. 

(c) The orbits for the repulsive force f(r) = —mKr are hyperbolas. (A modi- 
fication of the argument in Problem 2-4 (a) is also possible.) 


3. Consider an ellipse with one focus at the origin $O$, and the other at $(-2\varepsilon a, 0)$,
for which the sum of the distances $r$ and $2a - r$ from the foci to any point $(x, y)$
is the constant $2a$ (which requires $0 < \varepsilon < 1$).

(a) The distance $r$ satisfies
$$r = (1-\varepsilon^2)a - \varepsilon x = \Lambda - \varepsilon x, \qquad\text{for } \Lambda = (1-\varepsilon^2)a > 0,$$
and thus
$$r = \frac{\Lambda}{1+\varepsilon\cos\theta}.$$

(b) One semiaxis of the ellipse is obviously $a = \Lambda/(1-\varepsilon^2)$. The maximum and
minimum values $\pm b$ of $r\sin\theta$ occur when $\cos\theta = -\varepsilon$, so the other semiaxis $b$
satisfies
$$b = \frac{\Lambda}{\sqrt{1-\varepsilon^2}} = a\sqrt{1-\varepsilon^2}.$$
Thus, the ellipse $(x/a)^2 + (y/b)^2 = 1$ has eccentricity $\varepsilon = \sqrt{1 - (b/a)^2}$.

(c) If we choose our origin at the center of the ellipse, $(-\varepsilon a, 0)$, and let $\bar x, y$
denote coordinates with respect to this origin, then $\bar x = x + \varepsilon a$. Conclude that
we have
$$r = a - \varepsilon\bar x.$$

(d) Now consider a hyperbola with one focus at the origin, with the difference of
the distances being $2a$, and the other focus at $(-2\varepsilon a, 0)$ (which requires $\varepsilon > 1$).
For the branch of the hyperbola consisting of points where the distance to the
origin minus the distance to the other focus is $2a$, show that we obtain the same
equation as for the ellipse, except that $\Lambda = (1-\varepsilon^2)a < 0$. On the other hand,
for the other branch we get the same equation, but with $\Lambda = (\varepsilon^2 - 1)a > 0$, so
that $a = \Lambda/(\varepsilon^2 - 1)$.

(e) For this branch, since we must have $r = \Lambda/(1+\varepsilon\cos\theta) > 0$, and thus
$1 + \varepsilon\cos\theta > 0$, conclude that the positive angle $\theta$ that one of the asymptotes
of the hyperbola makes with the $x$-axis satisfies $\cos\theta = 1/\varepsilon$. Comparing with
the asymptotes of the hyperbola $(x/a)^2 - (y/b)^2 = 1$, conclude that this has
the same shape when $b = a\sqrt{\varepsilon^2 - 1}$, so that the eccentricity is given by
$\varepsilon = \sqrt{1 + (b/a)^2}$.

(f) Consider the parabola consisting of all points $(x, y)$ whose distance from the
origin is equal to the distance from the line $x = a$. Show that we obtain the
equation $a = r(1 + \cos\theta)$, again of the same form.


4. Every hyperbola has a conjugate hyperbola with the same asymptotes, the
same distance between its vertices, and the same distance between its foci.
In part (c) of Problem 2, show that for a given $E$ and $h$, the orbit has two
possible shapes, a hyperbola and its conjugate.

[Figure: a hyperbola $x^2/a^2 - y^2/b^2 = 1$ and its conjugate $y^2/b^2 - x^2/a^2 = 1$.]




5. (a) For an inverse square force $f(r) = mK/r^2$, use equation (D) to get
$$\theta = \int \frac{-h\,du}{\sqrt{2E + 2Ku - h^2u^2}}.$$
A table of integrals shows that for $a < 0$ we have
$$\int \frac{dx}{\sqrt{ax^2+bx+c}} = \frac{-1}{\sqrt{-a}}\,\arcsin\left(\frac{2ax+b}{\sqrt{b^2-4ac}}\right).$$
Use this to obtain the orbits.

(b) Use equation (D) to show that the orbits for a repulsive inverse square force 
are hyperbolas, each of which is an orbit for the attractive inverse square force 
with the center at the focus of the conjugate hyperbola. 


For the next few problems, recall that for the graph $r = f(\theta)$ in polar
coordinates, the length on $[\theta_0, \theta_1]$ is given by
$$\int_{\theta_0}^{\theta_1} \sqrt{f^2 + f'^2}.$$
For simplicity we will examine the inverse cube force with $K = 1$.


6. (a) Consider the logarithmic spiral $r = ae^{\gamma\theta}$, that is, the path
$c(\theta) = (ae^{\gamma\theta}\cos\theta, ae^{\gamma\theta}\sin\theta)$. Show that the radius vector from the origin to $c(\theta)$
makes a constant angle $\phi$ with the tangent vector at $c(\theta)$, where
$\cos^2\phi = \gamma^2/(1+\gamma^2)$.

(b) Show that the length of the spiral from any point to the origin, i.e.,
$$\int_{-\infty}^{\theta} \sqrt{f^2 + f'^2}, \qquad f(\theta) = ae^{\gamma\theta},$$
is finite. It is therefore hardly surprising that the orbit “reaches” the origin
(or “falls into the center”) in finite time. The equation $r'^2 = (1-h^2)/r^2$
[equation (c) on page 126, for $C = 0$] gives
$$\frac{dr}{dt} = -\frac{\sqrt{1-h^2}}{r}.$$
Solve for $r(t)$ and conclude that if the particle begins at distance $r_0$ from the
center, it will fall into the center at time $t = r_0^2/2\sqrt{1-h^2}$.


7. (a) The hyperbolic spiral $r = c/\theta$ does not spiral infinitely often as we go
to infinity; instead it approaches the line $y = c$ asymptotically.

(b) The length of the spiral from any point to the origin is infinite. 

(c) Nevertheless, the orbit on page 126 falls into the center in finite time. 




8. The length of the Cotes’ spirals from any point to the origin is also infinite, 
but the orbits again fall into the center in finite time. 


9. (a) Problem 2-3 presented Newton’s proof that if a particle moves in a circle
under a central force directed to a point of the circle, then the force must be an
inverse fifth power. To derive this from our equations, we consider a particle of
mass $m = 1$ travelling on a circle of radius 1 around the origin, starting at the
point $(1,0)$.

[Figure: the circle, with the point $(1,0)$ marked.]

Letting $\theta$ be the angle from the vertical to the radius vector, show
that
$$r(t) = 2\sin\theta(t), \qquad r'(t) = 2\cos\theta(t)\,\theta'(t)$$


and use the second form of equation (A) to obtain 


4h? 


(b) If conversely, we start with the force $f(r) = r^{-5}$, use equation (E) to obtain
$$\left(\frac{du}{d\theta}\right)^2 = \frac{u^4}{2h^2} - u^2 + \frac{2E}{h^2},$$
and for $r = 1/u$ obtain
$$\left(\frac{dr}{d\theta}\right)^2 = \frac{2Er^4}{h^2} - r^2 + \frac{1}{2h^2}$$
(this has essentially already been done in the equation at the bottom of page 121).
(c) Conclude that the inversion through the origin of an orbit is also an orbit, 


and use this fact to give an immediate proof that the unit circle through (1, 0) 
is an orbit. 







10. (a) For two particles $c_1$ and $c_2$ of masses $m_1$ and $m_2$ let $f(r) = Gm_1m_2/r^2$
be the gravitational force between them, where $r = |c_1 - c_2|$. Then the equation
for $c'' = (c_1 - c_2)''$ derived on page 136 can be regarded as the equation for a
particle of unit mass under an inverse square force of magnitude $G\cdot(m_1 + m_2)$.
Conclude that the distance $a$ between $c_1$ and $c_2$ and its period $\tau$ are related by
$$\tau^2 = \frac{4\pi^2a^3}{G\,(m_1+m_2)}.$$





(b) If $M$ is the mass of the sun, and $m_i$ ($i = 1, 2$) are the masses of two planets,
whose orbits have semimajor axes of length $a_i$ and periods $\tau_i$, then we have the
more exact form of Kepler’s third law
$$\frac{M+m_1}{M+m_2} = \left(\frac{a_1}{a_2}\right)^3\left(\frac{\tau_2}{\tau_1}\right)^2.$$

(c) Let $m$ be the mass of a planet, whose orbit around the sun has a semimajor
axis of length $a$ and period $\tau$, and let $m'$ be the mass of a moon of that planet,
whose orbit around the planet has a semimajor axis of length $a'$ and period $\tau'$.
Then
$$\frac{m+m'}{M+m} = \left(\frac{a'}{a}\right)^3\left(\frac{\tau}{\tau'}\right)^2,$$
and hence
$$\frac{m}{M} \approx \left(\frac{a'}{a}\right)^3\left(\frac{\tau}{\tau'}\right)^2.$$
This provided an accurate way to determine the mass, relative to the mass of
the sun, of all planets except Mercury, Venus, and Pluto [which actually turns
out to have moons]; their relative masses have now been measured by observing
their effect on spacecraft.

(d) Using the equations for $\gamma_i = c_i - C$ derived on page 136, find the ratio of
the periods of the two bodies’ orbits around their center of mass in terms of the
ratio of their masses and the semimajor axes of those orbits. The ratio of the
mass of the moon to that of the earth is 0.0123, and the period of the moon’s
orbit is very close to 39,360 minutes. Find the period of the earth’s orbit around
the center of mass of the earth-moon system.





11. Determining the orbit of a planet or a comet from only a few observations
requires the solution of a problem considered by Kepler, to find the time $t(\theta)$
required for a planet $P$ to go from its perihelion $A$ around the sun $S$ to its
position with a particular angle $\angle ASP = \theta$. However, Kepler didn’t work
directly with $\theta$. Drawing a circle whose diameter is the diameter of the elliptical
orbit, with center $C$, and letting $Q$ be the point of the circle on the perpendicular
to the diameter through $P$, we consider the angle $\phi = \angle ACQ$ at the center $C$,
the eccentric anomaly, as opposed to $\theta$, the true anomaly. We will first find the
connection between $\theta$ and $\phi$, and then derive Kepler’s equation, which gives a
formula for $t(\theta)$.


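(a) By Problem 3 we have
$$r = \frac{a(1-\varepsilon^2)}{1+\varepsilon\cos\theta}.$$
In addition, part (c) of the problem shows that for $\bar x = CQ'$ we have $r = a - \varepsilon\bar x$.
Conclude that
$$r = a(1 - \varepsilon\cos\phi),$$
and then that
$$(1 - \varepsilon\cos\phi)(1 + \varepsilon\cos\theta) = 1 - \varepsilon^2,$$
which can also be written as
$$\tan\frac{\phi}{2} = \sqrt{\frac{1-\varepsilon}{1+\varepsilon}}\,\tan\frac{\theta}{2},$$
or
$$\sin\phi = \frac{\sqrt{1-\varepsilon^2}\,\sin\theta}{1+\varepsilon\cos\theta}.$$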




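(b) Since the area swept out by the planet is proportional to the time, and the
total area $\pi ab$ is covered in the period $\tau$, for the sector $ASP$ we have
$$\begin{aligned}
t(\theta) &= \tau\cdot\frac{\text{area } ASP}{\pi ab} \\
&= \frac{\tau}{\pi ab}\cdot\frac{b}{a}\,\text{area } ASQ \\
&= \frac{\tau}{\pi a^2}\,[\text{area } ACQ - \text{area } SCQ] \\
&= \frac{\tau}{\pi a^2}\left[\frac{a^2}{2}\,\phi - \frac{a^2\varepsilon}{2}\,\sin\phi\right].
\end{aligned}$$
Recalling the formula for $\tau$ on page 125, conclude that we have Kepler’s equation
$$t(\theta) = \sqrt{\frac{a^3}{K}}\,(\phi - \varepsilon\sin\phi).$$
Thus we have to solve a transcendental equation $\phi - \varepsilon\sin\phi = \text{constant}$, for
which numerous numerical methods have been devised.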
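One of the simplest such methods is Newton's iteration. The following is a minimal sketch, not a method taken from the text: the names `eccentric_anomaly`, `true_anomaly` and `mean_anomaly` (standing for the constant $t\sqrt{K/a^3}$ on the right-hand side) are hypothetical, and the starting guess is a common heuristic that is assumed adequate for $0 \le \varepsilon < 1$.

```python
import math

def eccentric_anomaly(mean_anomaly, ecc, tol=1e-14, max_iter=50):
    """Solve phi - ecc*sin(phi) = mean_anomaly for phi by Newton's method (0 <= ecc < 1)."""
    phi = mean_anomaly if ecc < 0.8 else math.pi  # heuristic starting guess
    for _ in range(max_iter):
        residual = phi - ecc * math.sin(phi) - mean_anomaly
        step = residual / (1.0 - ecc * math.cos(phi))  # derivative is 1 - ecc*cos(phi) > 0
        phi -= step
        if abs(step) < tol:
            break
    return phi

def true_anomaly(phi, ecc):
    """Recover theta from phi via tan(theta/2) = sqrt((1+ecc)/(1-ecc)) * tan(phi/2)."""
    return 2.0 * math.atan2(math.sqrt(1.0 + ecc) * math.sin(phi / 2.0),
                            math.sqrt(1.0 - ecc) * math.cos(phi / 2.0))

phi = eccentric_anomaly(1.0, 0.3)
theta = true_anomaly(phi, 0.3)
print(phi, theta)
```

Once $\phi$ is known, the relation of part (a) between $\phi$ and $\theta$ gives the true anomaly, so the position of the planet at a given time follows.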

(c) For an analytic derivation of Kepler’s equation, we start with equation (A)
on page 121, and separate variables to obtain
$$t = \int \frac{dr}{\sqrt{2E + \dfrac{2K}{r} - \dfrac{h^2}{r^2}}} = \int \frac{r\,dr}{\sqrt{2Er^2 + 2Kr - h^2}}.$$
Use (b1) on page 124 and (b3) on page 125 to get
$$t = \sqrt{\frac{a}{K}} \int \frac{r\,dr}{\sqrt{a^2\varepsilon^2 - (r-a)^2}}.$$
Now use $r = a(1 - \varepsilon\cos\phi)$, derived in part (a), for a substitution in the integral.
The denominator should simplify to $a\varepsilon\sin\phi$, and you should end up with
$$t = \sqrt{\frac{a^3}{K}} \int (1 - \varepsilon\cos\phi)\,d\phi.$$


12. This problem, taken from Palais and Palais [1], proves continuity of
solutions of differential equations with respect to the defining equations.

(a) Let $f$ and $g$ be continuous functions on $[a,b]$ with $g$ nonnegative. Show
that on $[a,b]$ we have Gronwall’s inequality:
$$\text{If } f(x) \le C + \int_a^x fg, \quad\text{then } f(x) \le C\,e^{\int_a^x g}$$
(usually stated, and used, only for $C \ge 0$). Hint: Consider the derivative of
$$h(x) = \left(C + \int_a^x fg\right) e^{-\int_a^x g}.$$
In particular, for $K > 0$
$$\text{If } f(x) \le C + K\int_0^x f, \quad\text{then } f(x) \le C\,e^{Kx}.$$
0 


(b) For simplicity, we will consider a system of differential equations determined
by a function $\phi\colon \mathbb{R}^n \to \mathbb{R}^n$ without worrying about details concerning the
actual domain of $\phi$, etc. We will use $|\ |$ for the norm on $\mathbb{R}^n$ and assume
that $\phi$ satisfies a Lipschitz condition $|\phi(x) - \phi(y)| \le K|x - y|$; solutions $c_x$ with
$c_x(0) = x$, $c_x'(t) = \phi(c_x(t))$ will be assumed extended to a maximal interval.
All the considerations for these time-independent equations will also work just
as well for the time-dependent case.

To make the role of the “defining equation” $\phi$ explicit, we will use $c_x^\phi$ for the
curve with
$$c_x^\phi(0) = x, \qquad {c_x^\phi}'(t) = \phi(c_x^\phi(t)).$$
Now suppose we have $\psi\colon \mathbb{R}^n \to \mathbb{R}^n$ with
$$|\phi(x) - \psi(x)| < \varepsilon \quad\text{for all } x.$$
Prove that
$$|c_x^\phi(t) - c_x^\psi(t)| \le \frac{\varepsilon}{K}\left(e^{K|t|} - 1\right).$$
Hint: Writing $u(t) = |c_x^\phi(t) - c_x^\psi(t)| + \frac{\varepsilon}{K}$, the conclusion may be written
as $u(t) \le (\varepsilon/K)e^{K|t|}$, which follows from Gronwall’s inequality provided that
$u(t) \le \frac{\varepsilon}{K} + K\int_0^t u(s)\,ds$. Note that
$$u(t) - \frac{\varepsilon}{K} = |c_x^\phi(t) - c_x^\psi(t)| = \left|\int_0^t \phi(c_x^\phi(s)) - \psi(c_x^\psi(s))\,ds\right|,$$
and write $\phi(c_x^\phi(s)) - \psi(c_x^\psi(s))$ as a sum of two terms that can be estimated.
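The estimate of part (b) can be observed numerically in one dimension. The sketch below is an illustration under assumptions of my own choosing: $\phi(x) = \sin x$ (so $K = 1$), a perturbation $\psi$ with $|\phi - \psi| \le \varepsilon/2$, a classical fourth order Runge-Kutta integrator, and $t \ge 0$ only; none of these choices comes from the text.

```python
import math

def rk4(field, x0, t_end, steps):
    """Integrate x' = field(x) from t = 0 to t_end with classical RK4; return all samples."""
    h = t_end / steps
    x, samples = x0, [x0]
    for _ in range(steps):
        k1 = field(x)
        k2 = field(x + 0.5 * h * k1)
        k3 = field(x + 0.5 * h * k2)
        k4 = field(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        samples.append(x)
    return samples

EPS = 0.01   # assumed bound on |phi - psi|
K = 1.0      # Lipschitz constant of phi = sin

def phi(x):
    return math.sin(x)

def psi(x):
    return math.sin(x) + 0.5 * EPS * math.cos(3.0 * x)

def max_bound_violation(x0=0.7, t_end=3.0, steps=3000):
    """Largest value of |c_phi(t) - c_psi(t)| - (EPS/K)(e^{Kt} - 1) over the sampled times."""
    h = t_end / steps
    a = rk4(phi, x0, t_end, steps)
    b = rk4(psi, x0, t_end, steps)
    return max(abs(p - q) - (EPS / K) * (math.exp(K * i * h) - 1.0)
               for i, (p, q) in enumerate(zip(a, b)))

print(max_bound_violation())  # never positive beyond rounding if the bound holds
```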


13. For the equation x”(t) + g(x(t)) = 0 with g(0) = 0 but g’(0) < 0, use 
Problem 12 to prove that solutions x with x(0) = 0 will not remain small no 
matter how small we choose our initial value x’(0). 




14. Consider a central force with $f(r) = r^3 - r$, with corresponding potential
determined by $F(r) = \frac14(1 - r^2)^2$, a “Higgs potential”.

(a) Show that for $\rho$ close to 1, orbits close to the circular orbit at radius $\rho$ have
an apsidal angle $\alpha$ that is very small, so that the orbit oscillates around the
circular orbit many times (as in the picture on page 131).

(b) But the semiperiod $\sigma$ approaches $\pi/\sqrt2$ as $\rho$ approaches 1; this apparent
discrepancy is explained by the fact that $\theta'$ approaches 0 as $\rho$ approaches 1, so
that the orbits are being traversed more and more slowly.

That might lead us to conclude that motion on the orbit of radius 1 is infinitely
slow, which is indeed true in a sense: There is no circular orbit of radius 1; since
the force along the unit circle is 0, the only orbits that stay on the unit circle are
the ones that stay at a single point of the unit circle (note that the argument on
page 121 doesn’t apply in this case!).

(c) But there are still orbits that oscillate around the unit circle. Such orbits can
be found with arbitrarily small apsidal angles $\alpha$, though always with
semiperiods $\sigma$ close to $\pi/\sqrt2$.

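15. For the force
$$f(r) = \frac{1}{r^2} + \frac{C}{r^3}$$
for a constant $C$, solve equation (E) explicitly to get
$$r = \frac{h^2 - C}{1 + A(h^2 - C)\cos\gamma\theta}, \qquad \gamma = \sqrt{1 - \frac{C}{h^2}},$$
$$\phantom{r} = \frac{a(1-\varepsilon^2)}{1 + \varepsilon\cos\gamma\theta} \qquad\text{for an “eccentricity” } \varepsilon \text{ and “semimajor axis” } a.$$
Since the maximum of $r$ occurs for $\gamma\theta$ a multiple of $\pi$, this is a precessing
ellipse with apsidal angle of $2\pi/\gamma$. Check that as $\varepsilon \to 0$, this approaches $2\alpha$ for
the $\alpha$ given by (*) on page 132.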


16. Consider a particle of mass $m$ at distance $r$ from the center of a sphere of
radius $R$ having total mass $M$, and thus density $M/4\pi R^2$. Allowing ourselves
the luxury of a little rigor slippage, the mass of the shaded sector in the figure
is approximately
$$\frac{M}{4\pi R^2}\,(2\pi R\sin\theta)(R\,\Delta\theta) = \frac{M}{2}\,\sin\theta\,\Delta\theta,$$
so the potential function $V$ for the force on the particle is
$$V(r) = -\frac{GmM}{2}\int_0^\pi \frac{\sin\theta}{s(\theta)}\,d\theta, \qquad s(\theta)^2 = R^2 + r^2 - 2rR\cos\theta.$$

(a) Using the substitution $s^2 = R^2 + r^2 - 2rR\cos\theta$ in the integral, show that
$$V(r) = -\frac{GmM}{2rR}\int_{r-R}^{r+R} ds$$
to prove that the total force on the particle is $GmM/r^2$.

(b) For $r < R$ we have
$$V(r) = -\frac{GmM}{2rR}\int_{R-r}^{R+r} ds,$$
so $V$ is constant inside the sphere, and the force is 0.
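Both conclusions can be checked by evaluating the $\theta$-integral directly. This is a rough sketch under assumptions that are mine rather than the text's: units with $G = m = M = 1$, the sign convention in which the potential of an attractive force is negative, and a plain midpoint rule with a hypothetical function name `shell_potential`.

```python
import math

def shell_potential(r, R, n=20000):
    """Midpoint-rule value of -(1/2) * integral over [0, pi] of sin(theta)/s(theta),
    with s(theta)^2 = R^2 + r^2 - 2 r R cos(theta), in units where G = m = M = 1."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        s = math.sqrt(R * R + r * r - 2.0 * r * R * math.cos(theta))
        total += math.sin(theta) / s
    return -0.5 * total * h

print(shell_potential(2.0, 1.0))  # outside the shell: compare with -1/r
print(shell_potential(0.3, 1.0))  # inside the shell: compare with -1/R
print(shell_potential(0.6, 1.0))  # inside the shell again: same constant
```

Outside the shell the values follow $-1/r$, the potential of a point mass at the center, while inside they do not depend on $r$ at all.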


17. (a) For an arbitrary potential V, written as 





at a point at distance $r$ from the center of the shell the value of $V$ will be a
constant times
$$\frac{1}{r}\,[f(r+R) - f(r-R)].$$


(b) Suppose this is a constant C for r < R. Conclude that 


| 
C= 5,25 (2r) 
and then that 
RfQr)—rf(R+r)—rf(r—R) =0. 


(c) Then show that
$$f''(r+R) = f''(r-R)$$
and conclude that $f''$ is a constant, so that $V$ is a multiple of $1/r$.

This proof, attributed to Laplace, is quoted in Maxwell [1; pg. 422].




18. A ball is dropped into a hole drilled straight through the earth. Assuming 
that the earth has constant density, determine how the force on the ball varies 


with its distance from the center of the earth, and describe its motion. 


19. In the figure below, with a sphere $S$ of radius $r$ surrounded by the surface
$\partial B$, a small cone from the center of $S$ intersects $S$ in a region $a$ that is close to
a rectangle, and it intersects $\partial B$ in another such region $A$ at distance $R$ from
the origin, on which the normal $\nu$ makes an angle of $\theta$ with the normal of the
region on $S$. If $|F|$ varies inversely as the square of the distance from the center
of $S$, show that the flux of $F$ through the region $a$ equals the flux of $F$ through
region $A$. Intuitively, if we think of particles being emitted from the origin at a
steady rate, then the flux through any surface surrounding the origin is just the
total number of particles emitted per unit time.





20. The earth’s tides, or more generally, “tidal forces”, are entirely due to the
fact that gravitational forces are not uniform. The moon’s gravitational force
on the parts of the oceans nearest the moon is greater than the force on the
solid part of the earth, so the water bulges towards the moon, while the force
on the parts furthest from the moon is less than the force on the solid earth,
so they are pulled less and bulge away from the earth. For a computational
analysis, in the figure below the plane of the paper contains the centers of the
earth and moon; the dot at the middle of the earth is close to, but not exactly
at, the north pole. Let $m$ be the mass of the moon, $D$ the distance between
the center of the earth and the center of the moon, and $r$ the distance from
the center of the earth to a particle on the equator making an angle $\theta$ with the
line from the earth to the moon (if our dot were the north pole, $\theta$ would be the
particle’s longitude), with $x$ the distance along this axis from the center of the
earth.

[Figure: the earth and oceans, seen in the plane containing the centers of the earth and moon.]


(a) The distance from the center of the moon to this particle is
$$(r^2 - 2rD\cos\theta + D^2)^{1/2},$$
and since $r \ll D$, for the potential function $V$ we have
$$V \approx -Gm\left[\frac{1}{D} + \frac{r}{D^2}\cos\theta + \frac{r^2}{D^3}\left(\frac32\cos^2\theta - \frac12\right)\right].$$
The constant $-Gm/D$ is irrelevant, and the term $-Gm(r/D^2)\cos\theta$, which can
be written $-Gmx/D^2$, corresponds to the constant force $Gm/D^2$ towards the
moon, which we subtract in order to see the effect with respect to the earth’s
position, leaving us with
$$V_{\text{observed}} \approx -\frac{Gmr^2}{D^3}\left(\frac32\cos^2\theta - \frac12\right).$$
Notice that this gives the potential function $V_{\text{observed}}$ as a function of the
distance $r$ from the earth’s center. For this approximation we can just as well let $r$
simply denote the mean radius of the earth.
(b) To determine the height $h(\theta)$ of the tide above the average radius of the
earth, we use the fact that the surface of the water should be an equipotential surface
for the difference potential $V_{\text{observed}} + Gm_e/r^2$, where $m_e$ is the mass of the
earth.

[Figure: the earth and oceans, with the tide height $h(\theta)$ indicated.]

Using the fact that $h(\theta)$ is small compared to $r$, show that
$$h(\theta) = \frac{mr^4}{m_eD^3}\left(\frac32\cos^2\theta - \frac12\right).$$




(c) If the particle is not on the equator, but has “latitude” $\phi$, then $(\frac32\cos^2\theta - \frac12)$
should be replaced in all the formulas by $(\frac32\sin^2\phi\cos^2\theta - \frac12)$.

The gravitational force of the sun on the earth is about 175 times as great as
that of the moon, despite its much greater distance. But that greater distance
also diminishes its tidal effect. To compare the effect of the moon and the sun,
note that, apart from the factor $\frac32\cos^2\theta - \frac12$, the ratio $h(\theta)/r$ is $mr^3/m_eD^3$.
From the data $m/m_e = 1/81.3$ and $r/D = 1/60.3$, this factor is
$$\frac{mr^3}{m_eD^3} = 5.6\times10^{-8}.$$
If $M$ is the mass of the sun, and $a$ is the semimajor axis of the earth’s orbit,
then $M/m_e = 3.33\times10^5$ and $r/a = 4.26\times10^{-5}$, and the factor is
$$\frac{Mr^3}{m_ea^3} = 2.57\times10^{-8},$$
so the effect of the sun is somewhat less than half that of the moon.
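The arithmetic behind the two factors is short enough to reproduce directly. In this sketch the function name `tidal_factor` is a hypothetical label, and the inputs are just the mass and distance ratios quoted above.

```python
def tidal_factor(mass_ratio, distance_ratio):
    """(mass of attracting body / mass of earth) * (earth radius / distance)^3."""
    return mass_ratio * distance_ratio ** 3

moon = tidal_factor(1 / 81.3, 1 / 60.3)   # m/m_e and r/D
sun = tidal_factor(3.33e5, 4.26e-5)       # M/m_e and r/a
print(moon, sun, sun / moon)
```

The cube of the distance ratio is what lets the much nearer moon dominate, even though the sun's mass ratio is larger by many orders of magnitude.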

The earth of course exerts tidal forces on the moon, causing it to have a 
slightly ovaloid shape, with the narrower end tending to be pulled toward the 
center of the earth, which has resulted in the moon’s revolving exactly once for 
each revolution around the earth, so that the same side is always facing us. 


21. For $\beta > 2$, evaluate
$$\int \frac{dx}{\sqrt{x^\beta - x^2}} = \int \frac{dx}{x\sqrt{x^{\beta-2} - 1}}$$
with a substitution of the form $u^2 = \ldots$ to get an answer involving arctan, and
with a substitution of the form $y = x^\lambda$, for suitable choice of $\lambda$, to get an
answer involving arcsin.


22. This problem gives an elementary geometric argument, due to Hermann
Karcher, for the central force needed to produce an elliptical orbit when the
center is one focus of the ellipse, by determining the potential energy function,
rather than the magnitude of the force directly; it bears comparison with
Newton’s alternate proof in Problem 2-4, and uses an elementary geometry theorem
very much like the one in Problem 2-3: for two secants intersecting inside a
circle, as in the figure on page 137, we have $PA\cdot PB = PC\cdot PD$.

[Figure: two secants $AB$ and $CD$ of a circle intersecting at a point $P$ inside it.]

(a) Suppose our ellipse has foci $F_1$ and $F_2$, with $F_1P + PF_2 = 2a$ for all
points $P$ on the ellipse, and let $F_1F_2 = 2e$. We assume that our central force
is directed toward the focus $F_1$, and we consider a circle of radius $2a$ around
the other focus $F_2$. Extend $PF_2$ to intersect the circle at $A$ and $G$, and then
extend $AF_1$ to intersect the circle at $D$. Show that the three angles indicated
by small arcs, involving the tangent line to the ellipse at $P$, are all equal. Hint:
Compare Problem 2-4.





(b) Show that $AP$ and $F_1P$ have the same length $r$, so that the two segments
$AB$ and $BF_1$ have the same length $p$, which is the distance from $F_1$ to the
tangent line.

(c) Using the elementary geometry theorem mentioned before, show that
$2p\cdot F_1D = (2a - 2e)\cdot(2a + 2e)$, which we can write as
$$(1)\qquad p\cdot F_1D = \alpha,$$
for a constant $\alpha$, while conservation of angular momentum tells us that for the
velocity $v$ at $P$ we have
$$(2)\qquad p\cdot v = \beta$$
for some constant $\beta$ [so that $F_1D$ must be proportional to $v$, although only
equations (1) and (2) will be needed].

(d) Using similar triangles, show that
$$\frac{p}{r} = \frac{2p + F_1D}{4a}, \qquad\text{so that}\qquad \frac{1}{r} = \frac{1}{2a} + \frac{\alpha v^2}{4a\beta^2},$$
and conclude that $1/r$ is proportional to the potential.




23. This problem outlines a proof that the only central forces for which all
orbits are conics are, once again, multiples of either $f(r) = r$ or $f(r) = r^{-2}$, a
question that Bertrand later posed. This proof, together with remarks on proofs
by Darboux and Halphen, and references to analogous results, appears in the
classic book Appell [1; Vol. 1, sect. 232]. Not surprisingly, the proof is a lot more
algebraic in nature than the proof of Bertrand’s Theorem itself.


(a) A conic can generally be written as the graph of the function $y$ defined by
$$y(x) = \alpha x + \beta + \sqrt{ax^2 + 2bx + c}.$$
Compute $y''$ (carefully!) and conclude that $y''^{-2/3}$ is a second degree polynomial
in $x$, and thus that $y$ satisfies Halphen’s 5th order equation
$$(H)\qquad \left(y''^{-2/3}\right)''' = 0.$$


(b) Let c(t) = (x(t), y(¢)) be a solution to 


ar, S2er) pears 


dt? r(t)? dt? r(t) 


Then, by conservation of angular momentum (cf. equation (A$_1$) on page 83)
$$x\frac{dy}{dt} - y\frac{dx}{dt} = \alpha \qquad\text{for a constant } \alpha.$$


1 
Xo-, Ye, T=--, 
y y ¥ 
show that 
aX . dY _ dy 
dT °° dT at 
and thus 


ae Xx d*Y d? : 
0 _ d*y dt _ re 








dT? dT? deat oF 
This means that we can replace our problem about central forces with an
equivalent one involving forces parallel to the vertical axis.
So we will be considering a function $G$ such that $x(t)$ and $y(t)$ satisfying
$$(*)\qquad \frac{d^2x}{dt^2} = 0, \qquad \frac{d^2y}{dt^2} = G(x, y)$$
always lie on a conic.


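(c) We have
$$(1)\qquad \frac{dx}{dt} = a, \qquad \frac{d^2y}{dx^2} = \frac{1}{a^2}\,G.$$
Hence, if we write
$$(2)\qquad G = \mu[\phi(x,y)]^{-3/2}, \qquad G^{-2/3} = \mu^{-2/3}\phi(x,y)$$
for some constant $\mu$, then equation (H) gives $[\phi(x,y)]''' = 0$, where primes now
denote derivatives $d/dx$, so that we have, for example,
$$[\phi(x,y)]' = \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\,y'.$$
Compute $\phi''$ and $\phi'''$, use (1) and (2) to obtain
$$y'' = \frac{\mu}{a^2}\,\phi^{-3/2}, \qquad y''' = -\frac{3\mu}{2a^2}\,\phi^{-5/2}\left(\frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\,y'\right),$$
and then reduce the equation $\phi''' = 0$ to
$$\begin{aligned}
&\frac{\partial^3\phi}{\partial x^3} + 3\frac{\partial^3\phi}{\partial x^2\partial y}\,y' + 3\frac{\partial^3\phi}{\partial x\,\partial y^2}\,y'^2 + \frac{\partial^3\phi}{\partial y^3}\,y'^3 \\
&\qquad + \frac{3\mu}{2a^2}\,\phi^{-5/2}\left(2\phi\frac{\partial^2\phi}{\partial x\,\partial y} - \frac{\partial\phi}{\partial x}\frac{\partial\phi}{\partial y}\right) + \frac{3\mu y'}{2a^2}\,\phi^{-5/2}\left[2\phi\frac{\partial^2\phi}{\partial y^2} - \left(\frac{\partial\phi}{\partial y}\right)^2\right] = 0.
\end{aligned}$$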


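(d) Using the fact that this equation holds for any initial values of $x, y, y', a$,
conclude that we have
$$(\mathrm{i})\qquad \frac{\partial^3\phi}{\partial x^3} = \frac{\partial^3\phi}{\partial x^2\partial y} = \frac{\partial^3\phi}{\partial x\,\partial y^2} = \frac{\partial^3\phi}{\partial y^3} = 0,$$
$$(\mathrm{ii})\qquad 2\phi\frac{\partial^2\phi}{\partial x\,\partial y} - \frac{\partial\phi}{\partial x}\frac{\partial\phi}{\partial y} = 0, \qquad 2\phi\frac{\partial^2\phi}{\partial y^2} - \left(\frac{\partial\phi}{\partial y}\right)^2 = 0.$$
Note that (i) shows that $\phi$ is a second degree polynomial in $x$ and $y$,
$$\phi(x, y) = Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey + F.$$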


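(e) Use (ii) to show that
$$\phi(x,y) = \begin{cases} \dfrac{1}{C}\,(Bx + Cy + E)^2, & C \ne 0 \\[2mm] Ax^2 + 2Dx + F, & C = 0. \end{cases}$$
In terms of equation (2) we have
$$G = \begin{cases} \dfrac{\mu C^{3/2}}{(Bx + Cy + E)^3}, & C \ne 0 \\[2mm] \dfrac{\mu}{(Ax^2 + 2Dx + F)^{3/2}}, & C = 0 \end{cases}$$
and the corresponding central forces are
$$F = \begin{cases} \dfrac{\mu r\,C^{3/2}}{(Bx + Ey + C)^3}, & C \ne 0 \\[2mm] \dfrac{\mu r}{(Ax^2 + 2Dxy + Fy^2)^{3/2}}, & C = 0. \end{cases}$$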

(f) Assuming (as usual) that the central forces are radially symmetric, this gives 
us the desired result. 


24. (a) In the figure on page 147, the nucleus of the gold atom is at $(a\varepsilon, 0)$, and
the periapsis of an $\alpha$-particle on the dashed path is thus $a + a\varepsilon$. Remembering
that our $K = q_1q_2/m$, use formula (b′) to get
$$a = \frac{h^2m}{(\varepsilon^2 - 1)\,q_1q_2}$$
and thus
$$\varepsilon = \sqrt{1 + \frac{h^2m}{q_1q_2\,a}}.$$

(b) Using (b3′), show that
$$a(1 + \varepsilon) = \frac{q_1q_2}{2E}\left(1 + \sqrt{1 + \frac{2mh^2E}{(q_1q_2)^2}}\right).$$
The smallest periapsis clearly occurs for $h = 0$, and has the value
$$\frac{q_1q_2}{E}.$$

(c) Show that this is the same as choosing $s = 0$ (compare Problem 3-27).


CHAPTER 5 
RIGID BODIES 


The previous chapter, illustrating the power of Newton’s laws in analyzing
point masses, or objects that behave in certain respects like point masses,
would usually be regarded as material for intermediate or advanced mechanics
courses.

But, as we have pointed out numerous times in the previous chapters, many 
of the “elementary” problems of mechanics do not involve point masses, and 
instead require the analysis of rigid bodies. Since the study of rigid bodies is also 
generally regarded as an advanced part of mechanics, elementary mechanics 
books focus on special cases— often with various unstated assumptions—in order 
to have some problems to solve. 

Fortunately, we needn’t feel deterred by the use of slightly advanced mathe- 
matical notions, so we will be able to examine the basic assumptions that under- 
lie the treatment of rigid bodies without the distraction of various superfluous 
considerations. 

Rigid bodies are obviously idealizations, since in practice nothing is perfectly 
rigid. (In fact, in special relativity theory rigid bodies are actually impossible 
even in principle, though we won’t be worrying about that here.) Our aim, 
therefore, is not to produce a “realistic” model of a rigid body, but to define the 
proper abstract concept that corresponds to it. 


Equilibrium. Before trying to analyze the motion of rigid bodies in general, we
first ask when a rigid body should be in equilibrium under certain forces. As
the simplest possible example, let’s consider a “rigid rod” that consists of just
two points $b_1$ and $b_2$ (representing two molecules, say) at a distance $d$ apart,
and “external” forces $F_i$ acting on $b_i$. These forces might be produced, for
example, by someone exerting equal but opposite pressure on both sides of this
rod.

[Figure: the rod from $b_1$ to $b_2$, with the external forces $F_1$ and $F_2$ acting at its ends.]

If $F_1 = -F_2$, then we would expect this rigid rod to be in equilibrium under
these forces, and we can justify this expectation by noting that if we consider a
force $F_{21}$ on $b_2$ equal to $-F_2$ and a force $F_{12}$ on $b_1$ equal to $-F_1$, then these
“internal” forces $F_{12}$ and $F_{21}$ do satisfy Newton’s third law, and together with
the forces $F_1$ and $F_2$ they leave our rod, consisting of $b_1$ and $b_2$, in equilibrium.


[Figure: the rod from $b_1$ to $b_2$, with the external forces $F_1$, $F_2$ and the internal forces $F_{12}$, $F_{21}$.]


To be sure, as anticipated in Problem 1-4, this picture becomes quite a bit
hazier if we try to imagine how these “internal” forces would arise as the forces
$F_i$ are applied. Presumably the internal forces are 0 when the two molecules are
at their “natural” distance $d$ apart, but become strongly repulsive if the distance
is slightly smaller than $d$ and strongly attractive if the distance is slightly larger
than $d$. So, the forces $F_i$ initially push the molecules slightly toward each other;
as this happens, the molecules produce large repulsive forces, which will not
only return the molecules to their original position, but actually cause them to
move slightly further apart; this, in turn, will produce large attractive forces,
now moving the molecules back toward their “natural” separation, and slightly
beyond, causing the repulsive forces to act again. Thus, we would expect the
molecules to vibrate around their natural separation, which is more or less what
actually happens in our real-world approximation to a rigid rod.

We might hope to describe an ideal rigid rod by considering the limiting sit- 
uation as the constraining forces of the molecule are made greater and greater. 
But increasing the constraining forces simply causes the molecules to vibrate 
more and more rapidly—although they will stay closer and closer to their nat- 
ural separation, their motions will not approach a limit. 

So instead, we will consider our abstract rigid rod to be in equilibrium
simply because such forces $F_{ij}$ can be defined, without worrying about the details
of just how these forces would actually arise in practice, for rods that aren’t
ideally rigid.

More generally, let us consider a collection of points $b_1, \ldots, b_K$, which it will
sometimes be convenient to regard as a single object, $b = (b_1, \ldots, b_K)$, as well
as a collection of forces $F = (F_1, \ldots, F_K)$, where we regard $F_i$ as acting on $b_i$.
Then we can make the following definition:

The collection of points $b$ is in rigid equilibrium under the forces $F$ if there
exist “internal” forces $F_{ij} = -F_{ji}$ which are multiples of $b_i - b_j$ such that
$$F_i = -\sum_j F_{ij}.$$

Much more colloquially, of course, we just say that “the rigid body $b$ is in
equilibrium under the forces $F$”.




An important point about this definition is that the masses of the points $b_i$ do
not play a role, though the forces might very well depend on those masses (for
example, in a gravitational field); the masses will enter the discussion later on.

Another important point about the definition is that it is inadequate. For
example, we presumably ought to have equilibrium for the rod shown below,
where there are equal forces $F$ at the ends of the rod together with a
force $-2F$ in the middle.

[Figure: a rod with points $b_1$, $b_2$, $b_3$, forces $F$ at $b_1$ and $b_3$, and the force $-2F$ at $b_2$.]

But these forces obviously can’t be balanced by
forces that are multiples of the vectors $b_i - b_j$. Of course, in practice, the rod
will bend a bit, and in this situation the necessary “internal” forces will exist.


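[Figure: the slightly bent rod, with internal forces satisfying $F = -F_{12} - F_{13}$ at $b_1$ and $F = -F_{31} - F_{32}$ at $b_3$.]

Fortunately, we can stick with our strict theoretical model if we represent the
situation by a slightly more realistic figure, with a few extra “molecules”, so
that once again the required internal forces will exist.

[Figure: two rows of points, $b_1, b_2, b_3$ and $b_1', b_2', b_3'$, with the force $-2F$ applied at $b_2$ and forces $F$ applied at $b_1'$ and $b_3'$; a second diagram indicates the internal forces between these points.]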


We will normally presume that our particles do not lie on a straight line, or 
even on a plane, and in realistic situations the number of particles should be 
much greater, although special cases may be useful for illustration. 




It should also be noted that the $F_{ij}$ of our definition are almost never unique,
even for the special case of a “rigid rod” consisting of particles $b_1, \ldots, b_K$ lying
on a straight line. Given equal and opposite forces $F$ and $F'$ on the ends $b_1$
and $b_K$, we could choose just two forces $F_{1K}$ and $F_{K1}$ between $b_1$ and $b_K$,
essentially ignoring all the particles between them,

[Figure: the points $b_1, b_2, b_3, \ldots, b_K$ on a line, with $F$ and $F'$ at the ends and the two internal forces $F_{1K}$ and $F_{K1}$.]

but it would be more natural
to balance $F$ with a force $F_{21}$ exerted on $b_1$ by $b_2$, requiring an equal but
opposite force $F_{12}$ on $b_2$, which would in turn be balanced by a force $F_{32}$
exerted on $b_2$ by $b_3$, ... , leading finally to a force $F_{K,K-1}$ exerted on $b_K$ by
$b_{K-1}$ that balances $F'$.

[Figure: the same points, with internal forces $F_{21}$, $F_{12}$, $F_{32}$, $\ldots$, $F_{K,K-1}$ acting between neighbors.]


Virtual infinitesimal displacements. Our condition for ngid equilibrium intro- 
duces a whole set of unknown forces F;;, but we can obtain a consequence of 
this condition that does not involve these unknown forces by considering “rigid 
motions” of b. By this we simply mean a collection of paths ¢ = (c1,...,cK) 
with c;(0) = bj such that each 


lei (t) — ¢; (t)|* = (ce; (t) — cj(t), ci(t) —c;(t)) 1s constant. 


Alternatively, we might think ofa rigid motion as a curve t +> A(t) of isometries 
of R?, with c;(t) = A(t)(ci (0) = A(t)(b:). 


Given such a rigid motion, consider the K-tuple of tangent vectors
\[ v = (v_1, \dots, v_K) = (c_1'(0), \dots, c_K'(0)) \in (\mathbb{R}^3)^K. \]
Differentiating the equation
\[ \langle c_i(t) - c_j(t),\, c_i(t) - c_j(t) \rangle = \text{constant} \]
and evaluating at 0 gives us

\[ \langle v_i - v_j,\, b_i - b_j \rangle = 0. \tag{1} \]
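
Spelled out, the differentiation step uses only the symmetry and bilinearity of the inner product. The following is a sketch of that one-line computation in the notation above, with $c_i(0) = b_i$ and $c_i'(0) = v_i$:

```latex
\begin{align*}
0 &= \frac{d}{dt}\Big|_{t=0} \langle c_i(t) - c_j(t),\, c_i(t) - c_j(t) \rangle \\
  &= 2\,\langle c_i'(0) - c_j'(0),\, c_i(0) - c_j(0) \rangle \\
  &= 2\,\langle v_i - v_j,\, b_i - b_j \rangle .
\end{align*}
```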


Rigid Bodies 179 


We have always drawn our forces as if they satisfy the “strong form” of the 
third law, stated on page 25, and mentioned frequently in Chapter 3, and we will 
now specifically assume this, leaving further consideration of this assumption to 
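Addendum A. Then, since the force $F_{ij}$ is a multiple of $b_i - b_j$, equation (1)
implies that
\[ \langle v_i - v_j,\, F_{ij} \rangle = 0. \]
Consequently,
\[
\begin{aligned}
\sum_{i,j} \langle v_i, F_{ij} \rangle = \sum_{i,j} \langle v_j, F_{ij} \rangle &= -\sum_{i,j} \langle v_j, F_{ji} \rangle \\
&= -\sum_{i,j} \langle v_i, F_{ij} \rangle \qquad \text{(interchanging $i$ and $j$)}
\end{aligned}
\]
and thus
\[ \sum_{i,j} \langle v_i, F_{ij} \rangle = 0. \]

This in turn means that the external forces $F_k$ in the condition for rigid equilibrium satisfy
\[ \sum_k \langle v_k, F_k \rangle = -\sum_{k,i} \langle v_k, F_{ki} \rangle = 0, \]
or simply
\[ \sum_k \langle v_k, F_k \rangle = 0. \tag{$*$} \]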
Physicists refer to these K-tuples $v = (c_1'(0), \dots, c_K'(0))$ for rigid motions $c$
of b as “virtual infinitesimal displacements” of b. The word “infinitesimal”
in this phrase shouldn’t surprise us: it’s just the standard physicists’ way of
referring to tangent vectors. As for the word “virtual” here, it has about as
much meaning as it does in the phrase “virtual reality”. Basically it refers to
the fact that although we have obtained equation (*) under the assumption that
our rigid body is in equilibrium, we have done so by considering tangent vectors
to “virtual” rigid motions, i.e., motions that our rigid body might have had if it
weren’t in equilibrium.


Configuration space. This can all be expressed in a more familiar, geometric,
way by considering the “configuration space” of b, which is the subset
$M \subset (\mathbb{R}^3)^K$ of all points that can be reached from b at the end of a rigid motion. In
other words,

\[ M = \{ (A(b_1), \dots, A(b_K)) : A \text{ an orientation preserving isometry of } \mathbb{R}^3 \}. \]

When b is non-planar, M is a 6-dimensional manifold diffeomorphic to the set
of all orientation preserving isometries $A$ of $\mathbb{R}^3$, and thus to $\mathbb{R}^3 \times SO(3)$. With


180 Chapter 5 


this picture, a rigid motion of b is simply a curve in M, so a virtual infinitesimal
displacement v of b is simply a tangent vector to M at b.
We’ve already found that any such v satisfies the equation

\[ \langle v_i - v_j,\, b_i - b_j \rangle = 0. \tag{1} \]

If we define linear functions $\varphi_{ij}$ on $(\mathbb{R}^3)^K$ by
\[ \varphi_{ij}(v_1, \dots, v_K) = \langle v_i - v_j,\, b_i - b_j \rangle, \]
this says that the tangent space $M_b$ of $M$ at b satisfies
\[ M_b \subset \bigcap_{i,j} \ker \varphi_{ij}. \]

1. LEMMA. If b is non-planar, then
\[ M_b = \bigcap_{i,j} \ker \varphi_{ij}. \]


PROOF. By renumbering, we can assume that $b_1, b_2, b_3, b_4$ are points of b
that do not lie in a plane. There is clearly no loss of generality in assuming that
$b_1 = 0$ [as reflected by the fact that we can replace all $b_i$ by $b_i - b_1$ without
changing (1)]. Thus our assumption on $b_1, b_2, b_3, b_4$ amounts to $b_2, b_3, b_4$ being
linearly independent.

Since we can also replace all $v_i$ by $v_i - v_1$ without changing (1), it follows that

\[ \dim\Big( \bigcap_{i,j} \ker \varphi_{ij} \Big) = 3 + \dim\Big( \Big\{ (0, v_2, \dots, v_K) \in \bigcap_{i,j} \ker \varphi_{ij} \Big\} \Big). \]

Now for v with $v_1 = 0$, a first application of (1) gives
\[ \langle v_i, b_i \rangle = \langle v_i - v_1,\, b_i - b_1 \rangle = 0, \qquad i = 2, 3, 4, \]
and then a second application gives
\[ -\langle v_i, b_j \rangle = \langle v_j, b_i \rangle, \qquad i, j = 2, 3, 4. \]
So if $A : \mathbb{R}^3 \to \mathbb{R}^3$ is the linear transformation with
\[ v_i = A b_i, \qquad i = 2, 3, 4, \]
then $A$ is skew-adjoint. But the dimension of skew-symmetric $3 \times 3$ matrices is 3,
so the dimension of $\bigcap_{i,j} \ker \varphi_{ij}$ is at most 6, which is the dimension of $M_b$. ❖


Rigid Bodies 181 


By the way, it shouldn’t be too surprising that the mechanics of this proof
involved skew-adjoint transformations, since they are the derivatives of orthogonal ones (compare page 186); given the transformation $A$ of the proof, the
isometries $e^{tA}$ would produce the given infinitesimal virtual displacement v.


The principle of virtual work. If we use $\langle\ ,\ \rangle$ for the usual inner product on
$(\mathbb{R}^3)^K$, then equation (*) on page 179 can be written in the simple form

\[ \langle v, F \rangle = 0; \]

in other words, F is perpendicular to the tangent space $M_b$. As we noted in
Chapter 3, the inner product of force and distance is generally called work, so
this sum is also called the “(virtual) infinitesimal work” done by the forces F
during the (virtual) infinitesimal displacement v. Our little calculation that
$\langle v, F \rangle = 0$ if b is in rigid equilibrium under F is often referred to by physicists
as a proof of the “principle of virtual work”. In reality, however, when physicists
use the principle of virtual work they almost always assume implicitly that it
includes the converse:


2. PROPOSITION (THE PRINCIPLE OF VIRTUAL WORK). The non-planar collection of points b is in rigid equilibrium under the forces F if and
only if the virtual infinitesimal work $\langle v, F \rangle$ equals 0 for all virtual infinitesimal
displacements v of b.


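PROOF. We have to prove the converse part, that if $\langle v, F \rangle = 0$ for all v, then b
is in rigid equilibrium under F. If we consider the linear function $\Phi$ on $(\mathbb{R}^3)^K$
defined by
\[ \Phi(v_1, \dots, v_K) = \sum_k \langle v_k, F_k \rangle, \]
then our hypothesis says that $\Phi$ vanishes on $M_b$, and thus by our lemma,

\[ \Phi \text{ vanishes on } \bigcap_{i,j} \ker \varphi_{ij}, \qquad \varphi_{ij}(v_1, \dots, v_K) = \langle v_i - v_j,\, b_i - b_j \rangle. \]

A simple result about vector spaces (see Problem 1 for a refresher) then states
that there exist constants $\lambda_{ij}$ with

\[ \Phi = \sum_{i,j} \lambda_{ij} \varphi_{ij}. \]

In other words,

\[ \sum_k \langle v_k, F_k \rangle = \sum_{i,j} \lambda_{ij} \langle v_i - v_j,\, b_i - b_j \rangle \qquad \text{for all } (v_1, \dots, v_K) \in (\mathbb{R}^3)^K. \]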


182 Chapter 5 


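Choosing all $v_i$ to be 0 except for the one vector $v_1$, we thus obtain

\[
\begin{aligned}
\langle v_1, F_1 \rangle &= \sum_j \lambda_{1j} \langle v_1,\, b_1 - b_j \rangle + \sum_i \lambda_{i1} \langle -v_1,\, b_i - b_1 \rangle \\
&= \sum_j \lambda_{1j} \langle v_1,\, b_1 - b_j \rangle + \sum_j \lambda_{j1} \langle v_1,\, b_1 - b_j \rangle \\
&= \Big\langle v_1,\ \sum_j (\lambda_{1j} + \lambda_{j1})(b_1 - b_j) \Big\rangle,
\end{aligned}
\]

and since this is true for arbitrary vectors $v_1$ in $\mathbb{R}^3$, we conclude that

\[ F_1 = \sum_j (\lambda_{1j} + \lambda_{j1})(b_1 - b_j). \]

So we can define
\[ F_{1j} = -(\lambda_{1j} + \lambda_{j1})(b_1 - b_j) \]
to obtain the required forces. ❖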


As an extremely simple example, consider the situation shown below, where
the upward force on $b_2$ balances the two downward forces at points $b_1$ and $b_3$,
which are at different distances $a$ and $b$ from $b_2$, with the magnitudes of these
forces inversely proportional to those distances. Of course, this is merely a
schematic figure, since it is linear, and we really have to assume that there are
other points around, as in our previous examples.


Rigid Bodies 183 


It is easy to see that this collection of points is in rigid equilibrium under these 
forces: 


(1) For an infinitesimal displacement given by a vector z pointing in the 
vertical direction, the virtual infinitesimal work is 0 because the upward 
force is the negative of the sum of the two downward forces. 


(2) For an infinitesimal displacement given by a vector z pointing in the
horizontal direction, the virtual infinitesimal work is 0 because each individual component is 0.


(3) For an infinitesimal displacement generated by a rotation around $b_2$ (i.e.,
around an axis through $b_2$ perpendicular to the plane of the diagram),
the vectors $v_1$ and $v_3$ will be in opposite vertical directions, with lengths
proportional to the distances $a$ and $b$. Consequently, the virtual infinitesimal work, involving vectors with length inversely proportional to these
distances, will be 0. (One can check directly that this is just as true for an
infinitesimal displacement generated by a rotation around either $b_1$ or
$b_3$, but that isn’t necessary, since the set of virtual infinitesimal displacements that stay in the plane of the diagram has dimension 3.)


(4) For infinitesimal displacements given by a vector perpendicular to the 
plane of the figure, or by a rotation through axes perpendicular to our 
first rotation, the virtual infinitesimal work also works out to be 0; or we 
can just simplify matters by restricting our attention to the 2-dimensional 
situation to begin with. 
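
The four checks above can also be carried out numerically. The following is a minimal sketch, not part of the original argument: it assumes coordinates with $b_2$ at the origin and the lever along the x-axis, picks illustrative values for the distances and for the proportionality constant `k`, and writes every virtual infinitesimal displacement as a translation `z` plus a rotation `eta`, so that $v_i = z + \eta \times b_i$. All function and variable names are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Usual inner product on R^3."""
    return sum(x * y for x, y in zip(u, v))


def virtual_work(points, forces, z, eta):
    """Sum of <v_i, F_i> for the displacement v_i = z + eta x b_i."""
    total = 0.0
    for b, force in zip(points, forces):
        rotation_part = cross(eta, b)
        v = tuple(zc + rc for zc, rc in zip(z, rotation_part))
        total += dot(v, force)
    return total


# Assumed illustrative data: distances a, b from the fulcrum, constant k.
a, b, k = 2.0, 5.0, 10.0
points = [(-a, 0.0, 0.0), (0.0, 0.0, 0.0), (b, 0.0, 0.0)]
forces = [(0.0, -k / a, 0.0),          # downward force at b_1
          (0.0, k / a + k / b, 0.0),   # upward force at the fulcrum b_2
          (0.0, -k / b, 0.0)]          # downward force at b_3

basis = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
zero = (0.0, 0.0, 0.0)

# Three translations and three rotations span the 6-dimensional tangent space.
works = [virtual_work(points, forces, e, zero) for e in basis]
works += [virtual_work(points, forces, zero, e) for e in basis]
print(works)
```

Since the virtual work is linear in the displacement, vanishing on these six generators is enough to cover every virtual infinitesimal displacement.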


Notice that this provides a fairly good schematic representation of a lever, 
which of course requires not only a rigid body, but also a fulcrum, an immovable 
point. In practice, this “immobility” is provided in a complicated way by the 





connections between the fulcrum and the earth, but it seems reasonable simply 
to regard this connection as a mechanism that automatically supplies the proper 
upward force to the fulcrum when the downward forces are applied at the ends 
of the lever. 

Naturally, a more realistic picture would use a much larger number of points,
forming a 3-dimensional object. But in any case, our analysis shows, especially 


184 Chapter 5 


when we think of the lever as bending slightly, as on page 177, that it is the 
internal forces of the lever that make the weights balance; in short, all the 
“extra force” that one obtains by pushing at a large distance from the fulcrum 
is supplied by the lever itself, in its effort to preserve rigidity (together with the 
force that the earth supplies on the fulcrum, to keep it from moving downward). 

Of course, for a truly realistic picture, we would need further information 
to determine how the forces within the lever actually arise: the internal forces 
guaranteed by the principle of virtual work certainly won’t be unique, any more 
than they were in the unrealistic case mentioned on page 178. In fact, once we
have 4 points $b_1, b_2, b_3, b_4$ not on a plane, all $\varphi_{ij}$ can be expressed as linear
combinations of the $\varphi_{ij}$ for $i, j = 1, \dots, 4$, which means that the $\lambda_{ij}$ in our proof
are by no means unique, and thus the $F_{ij}$ aren’t either.


d’Alembert’s principle. Although we have so far investigated only a rigid body 
in equilibrium, our analysis easily extends to the more general situation. 

Instead of looking for a condition for equilibrium, we now seek a criterion for
a rigid motion $c = (c_1, \dots, c_K)$ of $b = (b_1, \dots, b_K)$ to be consistent with the
forces $F = (F_1, \dots, F_K)$. The $F_i$ should be considered as functions on $M \times \mathbb{R}$,
to encompass the general situation mentioned at the beginning of Chapter 3,
where the forces may depend not only on time, but also on the particular rigid
motion that the body has undergone at any particular time.

We now need “internal” forces $F_{ij}(t) = -F_{ji}(t)$ with $F_{ij}(t)$ a multiple of
$c_i(t) - c_j(t)$ so that

\[ m_i c_i''(t) = F_i(c(t), t) + \sum_j F_{ij}(t) \]
or
\[ m_i c_i''(t) - F_i(c(t), t) = \sum_j F_{ij}(t). \]

The latter equation, which may be regarded as stating that the body is in rigid
equilibrium under the forces $F_i - m_i c_i''$, is often called “d’Alembert’s principle”
and regarded as the fundamental law, so that, as the physicists like to say,
“dynamics reduces to statics”. But this really becomes useful only when we
apply the principle of virtual work: The conditions on the $F_{ij}$ imply that

\[ \sum_i \langle v_i,\ m_i c_i''(t) - F_i(c(t), t) \rangle = 0 \]


Rigid Bodies 185 


for all tangent vectors v, and, conversely, the principle of virtual work implies
that if this condition holds, then the requisite $F_{ij}(t)$ exist. This leads us to the
following definition:


“d’Alembert’s Principle”: The rigid motion c is a rigid solution for
the forces F, or, more colloquially, “c is a possible motion of the rigid
body b under the forces F”, if for each $t$,

\[ \sum_i \langle F_i(c(t), t) - m_i c_i''(t),\ v_i \rangle = 0 \]

for all tangent vectors v at $M_{c(t)}$.


If we agree to let $mc$ denote $(m_1 c_1, \dots, m_K c_K)$ and similarly for $mc''$, and
also let $\langle\ ,\ \rangle$ denote the usual inner product on $(\mathbb{R}^3)^K$, then we can write

\[ \langle F(c(t), t) - mc''(t),\ v \rangle = 0 \qquad \text{for all $v$ tangent to $M_{c(t)}$,} \]
or, if we are willing to tolerate a little ambiguity in our notation, simply
\[ \langle F - mc'',\ v \rangle = 0 \qquad \text{for all $v$ tangent to $M_c$.} \tag{$**$} \]

Our condition amounts to a system of second order differential equations for
vector-valued functions: If we choose a local coordinate system $x^1, \dots, x^6$ on
the 6-dimensional manifold M, then we only have to verify (**) for $v = \partial/\partial x^i$,
giving us 6 equations for the vector-valued functions $c^i = x^i \circ c$.

Since M is basically $\mathbb{R}^3 \times SO(3)$, we can restate this much more concretely.
One 3-dimensional collection of vector fields tangent to M are those of the form
$v_i = z$ for a constant vector $z$. Condition (**) becomes
\[
\begin{aligned}
0 &= \sum_i \langle F_i - m_i c_i'',\ z \rangle \\
&= \Big\langle \sum_i F_i,\ z \Big\rangle - \Big\langle \sum_i m_i c_i'',\ z \Big\rangle.
\end{aligned}
\]
Since this must hold for all $z$, we must have

\[ F_{\text{total}} = \sum_i F_i = \sum_i m_i c_i'' = M C'', \tag{$F_{\text{rigid}}$} \]

where $C$ is the center of mass, and $M = \sum_i m_i$ is the total mass [recall that $F_i$
really stands for $t \mapsto F_i(c(t), t)$].


186 Chapter 5 


We also have to consider the vector fields generated by rotations. As we saw
in our discussion of the cross-product, these are of the form $v_i = c_i \times \eta$. Thus,
condition (**) becomes

\[
\begin{aligned}
0 &= \sum_i \langle F_i - m_i c_i'',\ c_i \times \eta \rangle \\
&= \sum_i \langle F_i,\ c_i \times \eta \rangle - \sum_i m_i \langle c_i'',\ c_i \times \eta \rangle \\
&= -\sum_i \langle c_i \times F_i,\ \eta \rangle + \sum_i m_i \langle c_i \times c_i'',\ \eta \rangle.
\end{aligned}
\]
Since this must hold for all $\eta$, we must have, recalling (L′) on page 83,
\[ \tau = \sum_i c_i \times F_i = \sum_i m_i c_i \times c_i'' = L'. \tag{$\tau_{\text{rigid}}$} \]

Condition ($F_{\text{rigid}}$) simply says that the rigid body must move in such a way
that the momentum law is satisfied, while condition ($\tau_{\text{rigid}}$) simply says that the
rigid body must move in such a way that the angular momentum law is satisfied.


The inertia tensor. To solve these equations we might begin by writing our rigid
motion $c = (c_1, \dots, c_K)$ of b in the form

\[ c_i(t) = B(t)(b_i) + w(t) \]
for orthogonal $B(t)$. Since $B B^t = I$, we have
\[ B' B^t = -B B'^t = -(B' B^t)^t, \]
so $B' B^{-1}(t) = B' B^t(t)$ is skew-adjoint, and its matrix can be written as
\[ \begin{pmatrix} 0 & -\omega_3(t) & \omega_2(t) \\ \omega_3(t) & 0 & -\omega_1(t) \\ -\omega_2(t) & \omega_1(t) & 0 \end{pmatrix}. \]
Setting $\omega(t) = (\omega_1(t), \omega_2(t), \omega_3(t))$, we then have
\[
\begin{aligned}
v_i(t) = c_i'(t) &= B'(t)(b_i) + w'(t) \\
&= (B'(t) B^t(t))(B(t)(b_i)) + w'(t) \\
&= (B'(t) B^{-1}(t))(c_i(t)) + w'(t) \\
&= [\omega(t) \times c_i(t)] + w'(t).
\end{aligned}
\]
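
The last step relies on the fact that applying the skew-symmetric matrix built from $\omega$ is the same as taking the cross product with $\omega$. As a quick sanity check, here is a small sketch with arbitrary illustrative numbers; the helper names `skew` and `mat_vec` are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def skew(w):
    """Skew-symmetric matrix associated with w = (w1, w2, w3)."""
    w1, w2, w3 = w
    return [[0.0, -w3, w2],
            [w3, 0.0, -w1],
            [-w2, w1, 0.0]]


def mat_vec(matrix, x):
    """Apply a 3x3 matrix (list of rows) to a 3-vector."""
    return tuple(sum(matrix[r][k] * x[k] for k in range(3)) for r in range(3))


omega = (1.0, -2.0, 3.0)
c = (4.0, 5.0, 6.0)

lhs = mat_vec(skew(omega), c)   # matrix applied to c
rhs = cross(omega, c)           # omega x c
print(lhs, rhs)                 # both are (-27.0, 6.0, 13.0)
```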


Rigid Bodies 187


There is a natural choice for $B(t)$ and $w(t)$ in our description of rigid body
motion: choose $w(t) = C(t)$, where $C(t)$ is the center of mass at time $t$, so that
$B(t)$ represents the rotation about the center of mass from its initial position to
its position at time $t$. We then have

\[ v_i(t) = [\omega(t) \times c_i] + C'. \]

We can now write the angular momentum $L$ of $c$ as

\[
\begin{aligned}
\sum_i m_i c_i \times v_i &= \sum_i m_i c_i \times [(\omega \times c_i) + C'] \\
&= \sum_i [m_i c_i \times (\omega \times c_i)] + \Big( \sum_i m_i c_i \Big) \times C' \\
&= \sum_i [m_i c_i \times (\omega \times c_i)] + (M C \times C'), \qquad M = \sum_i m_i.
\end{aligned}
\]

Comparing this with the formula on page 84, we see that the quantity
\[ \sum_i m_i c_i \times (\omega \times c_i) \]
is the same as the “rotational angular momentum”, that is, the angular momentum of $c$ around its center of mass.

In Chapter 9 we look at specific examples of the equations ($F_{\text{rigid}}$) and ($\tau_{\text{rigid}}$),
but for now we only want to consider some basic aspects of the general problem.
To simplify matters, we can ignore the motion of the center of mass, and just
look for a solution of the form $c_i(t) = B(t)(b_i)$, essentially describing how
the body rotates about the center of mass; the more general case just involves
writing longer equations, without changing the main point we want to make.
In fact, we will simply assume that some point in our body, not necessarily the
center of mass, is fixed, and consider it to be the origin of our coordinate system.

Thus, we are looking for $\omega$ so that ($\tau_{\text{rigid}}$) holds when we have

\[ c_i' = v_i = \omega \times c_i. \]
We have

\[
\begin{aligned}
c_i'' &= (\omega' \times c_i) + (\omega \times c_i') \\
&= (\omega' \times c_i) + (\omega \times (\omega \times c_i)),
\end{aligned}
\]

so equation ($\tau_{\text{rigid}}$) becomes

\[ \tau = \sum_i m_i c_i \times (\omega' \times c_i) + \sum_i m_i c_i \times (\omega \times (\omega \times c_i)). \]
i i 


188 Chapter 5 


In terms of the linear function $\mathbf{I}_c : \mathbb{R}^3 \to \mathbb{R}^3$ defined by

\[ \mathbf{I}_c(\eta) = \sum_i m_i c_i \times (\eta \times c_i), \qquad \text{with } \mathbf{I}_c(\omega) = L, \]

we can write this as

\[ \mathbf{I}_c(\omega') = \tau - \sum_i m_i c_i \times (\omega \times (\omega \times c_i)), \tag{$*$} \]

where the right side depends on $c \in M$ and $\omega$. The only thing we need to
check is that we can always solve this for some $\omega'$ as a function of $c$ and $\omega$,
thereby obtaining a system of first order equations for $\omega$, and thus a system of
second order equations for the elements of $B$. In other words, we need to know
that the linear transformation $\mathbf{I}_c$ is an isomorphism for all $c \in M$.

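Since we only have to consider $c \in M$ of the form $c_i = P(b_i)$ for some
orthogonal $P$, we have

\[
\begin{aligned}
\mathbf{I}_c(\eta) &= \sum_i m_i P(b_i) \times (\eta \times P(b_i)) \\
&= P\Big( \sum_i m_i b_i \times (P^{-1}(\eta) \times b_i) \Big) \\
&= (P\, \mathbf{I}_b\, P^{-1})(\eta).
\end{aligned}
\]

Thus we only have to check that $\mathbf{I} = \mathbf{I}_b$ is an isomorphism, where

\[ \mathbf{I}(\varphi) = \sum_i m_i b_i \times (\varphi \times b_i) \qquad \text{for } \varphi \in \mathbb{R}^3. \]

Now for any $\psi \in \mathbb{R}^3$ we have

\[
\begin{aligned}
\langle \mathbf{I}(\varphi), \psi \rangle &= \sum_i \langle m_i b_i \times (\varphi \times b_i),\ \psi \rangle \\
&= \sum_i m_i \langle \varphi \times b_i,\ \psi \times b_i \rangle \\
&= \sum_i m_i \langle \psi \times b_i,\ \varphi \times b_i \rangle \\
&= \langle \mathbf{I}(\psi), \varphi \rangle.
\end{aligned}
\]

Thus $\mathbf{I}$ is self-adjoint, and consequently has an orthonormal basis of eigenvectors. Since

\[ \langle \mathbf{I}(\varphi), \varphi \rangle = \sum_i m_i |\varphi \times b_i|^2, \]

the corresponding eigenvalues are all $\ge 0$, and in fact they are all $> 0$ because
we are assuming that b is non-planar, and thus at least one $|\varphi \times b_i| > 0$. Since $\mathbf{I}$
always has positive eigenvalues, it is always an isomorphism, so we can indeed
always solve equation (*) for $\omega'$.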


Rigid Bodies 189 


The map $\mathbf{I} = \mathbf{I}_b$, satisfying
\[ \mathbf{I}(\omega) = L, \tag{I} \]

is called the inertia tensor of b (with respect to the fixed point). The directions
of its eigenvectors are called the principal axes of inertia, the corresponding
eigenvalues are called the principal moments of inertia, and the inertia ellipsoid
about the fixed point is the surface consisting of all vectors $\omega$ with $\langle \mathbf{I}(\omega), \omega \rangle = 1$
(so the semi-axes of the inertia ellipsoid are the reciprocals of the square
roots of the principal moments of inertia). As our rigid body moves under the
rotations $B(t)$, the inertia tensor for $B(t)(b)$ is just the composition
$B(t) \circ \mathbf{I} \circ B^{-1}(t)$; the principal moments of inertia remain the same for all positions of
the rigid body under the motion, and the whole inertia ellipsoid, including the
principal axes of inertia, is transformed by the $B(t)$.

Aside from the values of $F_{\text{total}}$ and $\tau$, the principal moments of inertia are
the only other data entering into our equations, so, in a sense, the whole motion
of the rigid body b depends only on them. In particular, for motion under no
external forces, we obtain exactly the same equations for two rigid bodies of
arbitrary shape, provided only that they have the same principal moments of
inertia.


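Calculating the inertia tensor. To write down the matrix of $\mathbf{I}$ with respect to the
standard basis $(e_1, e_2, e_3)$ we temporarily adopt the notation $b_i = (x_i, y_i, z_i)$.
Using the identity¹

\[ \omega \times (u \times v) = \langle \omega, v \rangle u - \langle \omega, u \rangle v \]
we can write
\[ \mathbf{I}(\varphi) = \sum_i m_i \big( |b_i|^2 \varphi - \langle b_i, \varphi \rangle b_i \big), \]
which gives
\[
\begin{aligned}
\mathbf{I}(e_1) &= \sum_i m_i \big( |b_i|^2 e_1 - \langle b_i, e_1 \rangle b_i \big) \\
&= \sum_i m_i \big( |b_i|^2 e_1 - x_i (x_i, y_i, z_i) \big) \\
&= \sum_i m_i \big( y_i^2 + z_i^2,\ -x_i y_i,\ -x_i z_i \big),
\end{aligned}
\]

with similar results for $e_2$ and $e_3$. Thus, the matrix of $\mathbf{I}$ with respect to the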


¹ Proof: Since $\omega \times (u \times v)$ is perpendicular to $u \times v$, it is a linear combination of $u$
and $v$, and the appropriate coefficients are easy to determine using the usual identities
for $\times$.


190 Chapter 5 


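standard basis $(e_1, e_2, e_3)$ is

\[
\mathcal{I} = \begin{pmatrix}
\sum_i m_i (y_i^2 + z_i^2) & -\sum_i m_i x_i y_i & -\sum_i m_i x_i z_i \\
-\sum_i m_i y_i x_i & \sum_i m_i (x_i^2 + z_i^2) & -\sum_i m_i y_i z_i \\
-\sum_i m_i z_i x_i & -\sum_i m_i z_i y_i & \sum_i m_i (x_i^2 + y_i^2)
\end{pmatrix}.
\]

The same result obviously holds for any orthonormal basis $(v_1, v_2, v_3)$ if we set
$x_i = \langle b_i, v_1 \rangle$, $y_i = \langle b_i, v_2 \rangle$, and $z_i = \langle b_i, v_3 \rangle$.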
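
For a finite set of point masses the matrix can be assembled directly from this formula. The sketch below uses made-up masses and positions, chosen only to be non-planar, and compares the matrix applied to a vector with the defining sum of double cross products; the function names are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Usual inner product on R^3."""
    return sum(x * y for x, y in zip(u, v))


def inertia_matrix(masses, points):
    """Matrix with entries sum_i m_i (|b_i|^2 * delta_rc - b_i[r] * b_i[c])."""
    matrix = [[0.0] * 3 for _ in range(3)]
    for m, b in zip(masses, points):
        norm_sq = dot(b, b)
        for r in range(3):
            for c in range(3):
                diagonal = norm_sq if r == c else 0.0
                matrix[r][c] += m * (diagonal - b[r] * b[c])
    return matrix


def inertia_apply(masses, points, phi):
    """Direct evaluation of sum_i m_i b_i x (phi x b_i)."""
    total = [0.0, 0.0, 0.0]
    for m, b in zip(masses, points):
        term = cross(b, cross(phi, b))
        for k in range(3):
            total[k] += m * term[k]
    return tuple(total)


# Assumed illustrative data: four non-planar points with masses 1..4.
masses = [1.0, 2.0, 3.0, 4.0]
points = [(1.0, 2.0, 0.0), (0.0, 1.0, 3.0), (2.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
phi = (1.0, -2.0, 3.0)

I = inertia_matrix(masses, points)
via_matrix = tuple(sum(I[r][k] * phi[k] for k in range(3)) for r in range(3))
direct = inertia_apply(masses, points, phi)
quadratic_form = dot(direct, phi)   # equals sum_i m_i |phi x b_i|^2

print(I)
print(via_matrix, direct, quadratic_form)
```

The matrix comes out symmetric and the quadratic form is positive, in line with the self-adjointness and positivity shown earlier.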

The diagonal terms of the matrix of $\mathbf{I}$ with respect to an orthonormal basis
are the quantities that were classically called the “moments of inertia” of b
about the axes; the off-diagonal terms are sometimes called the “products of
inertia”. In other words, the moment of inertia $I_A$ of b about an axis $A$ is

\[ I_A = \sum_i m_i r_i^2, \]

where $r_i$ is the distance from $b_i$ to $A$. If our orthonormal coordinate system
happens to point along the principal axes, then these moments of inertia are
the principal moments of inertia.

Note that if the three principal axes all have the same moment of inertia $I$
(see Problems 6 and 8 for examples), then all axes are principal axes, and we
have $L = \mathbf{I}(\omega) = I\omega$. In this case, the equation $\tau = L'$ simply becomes

\[ \tau = I \omega', \tag{$\tau_{\text{symmetric}}$} \]

in exact analogy with $F = mv'$.

As we might expect, for a rigid body b a special role is played by the moments
of inertia about the axes that pass through the center of mass $C = (x_C, y_C, z_C)$.
More generally, there is a simple relationship between the matrix $\mathcal{I}$ and the
matrix $\mathcal{I}'$ of the inertia tensor of b in a parallel coordinate system whose origin
is $C$. Let

\[ x_i = \bar{x}_i + x_C, \qquad y_i = \bar{y}_i + y_C, \qquad z_i = \bar{z}_i + z_C, \]
so that $(\bar{x}_i, \bar{y}_i, \bar{z}_i)$ are the coordinates of the rigid body in this system, and let
$M = \sum_i m_i$. When we write $\mathcal{I}$ in terms of the $\bar{x}_i, \bar{y}_i, \bar{z}_i$, any cross term like
$\sum_i m_i \bar{y}_i y_C$ vanishes, since

\[ \sum_i m_i \bar{y}_i = \sum_i m_i (y_i - y_C) = \sum_i m_i y_i - M y_C, \]


Rigid Bodies 191


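which is 0 by definition of $y_C$. Thus, we obtain simply

\[
\mathcal{I} = \mathcal{I}' + M \cdot \begin{pmatrix}
y_C^2 + z_C^2 & -x_C y_C & -x_C z_C \\
-y_C x_C & x_C^2 + z_C^2 & -y_C z_C \\
-z_C x_C & -z_C y_C & x_C^2 + y_C^2
\end{pmatrix}
= \mathcal{I}' + \mathcal{I}_C,
\]

where $\mathcal{I}_C$ is the matrix of the inertia tensor of the single body $C$ with mass $M$
around the origin of our original coordinate system.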
In particular, we have 


3. PROPOSITION (THE PARALLEL AXIS THEOREM OR STEINER’S 
THEOREM). If the point $P$ is at distance $d$ from the center of mass $C$ of b,
the moment of inertia of b about any axis through P is Md? plus the moment 
of inertia about the parallel axis through C. 
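
A numerical check of the parallel axis theorem for point masses might look as follows. This is only a sketch under stated assumptions: the masses, positions, axis direction `u` and point `P` are made up, and `d` is taken to be the perpendicular distance between the two parallel axes (which equals $|P - C|$ when $P - C$ is perpendicular to the axis).

```python
def dot(u, v):
    """Usual inner product on R^3."""
    return sum(x * y for x, y in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))


def moment_about_axis(masses, points, p, u):
    """Sum of m_i r_i^2, with r_i the distance from b_i to the line through p
    with unit direction u."""
    total = 0.0
    for m, b in zip(masses, points):
        rel = sub(b, p)
        along = dot(rel, u)
        total += m * (dot(rel, rel) - along * along)
    return total


# Assumed illustrative data.
masses = [1.0, 2.0, 3.0, 4.0]
points = [(1.0, 2.0, 0.0), (0.0, 1.0, 3.0), (2.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
M = sum(masses)
C = tuple(sum(m * b[k] for m, b in zip(masses, points)) / M for k in range(3))

u = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # unit vector along both axes
P = (3.0, -1.0, 2.0)                    # a point on the second axis

offset = sub(P, C)
d_squared = dot(offset, offset) - dot(offset, u) ** 2   # squared axis distance

I_C = moment_about_axis(masses, points, C, u)
I_P = moment_about_axis(masses, points, P, u)
print(I_P, I_C + M * d_squared)
```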


Rotation about an axis. Moments of inertia play an important role when we
consider a rigid body whose motion is a rotation about an axis. For a unit
vector $u$ pointing along this axis, decompose each $c_i$ as
\[ c_i = \bar{c}_i + \langle c_i, u \rangle u \tag{1} \]

where $\bar{c}_i$ is in the plane perpendicular to $u$, so that $|\bar{c}_i|$ is the distance $r_i$ from $c_i$
to the axis. The tangent vector $c_i'$ is in the same plane as $\bar{c}_i$ and perpendicular
to it, and if $\theta(t)$ is the angle through which the body has rotated at time $t$, then
the length of $c_i'$ is $r_i \theta'$. So
\[ \bar{c}_i \times \bar{c}_i' = r_i^2 \theta' \cdot u, \]

and it follows easily that

\[ \langle c_i \times c_i',\ u \rangle = \langle \bar{c}_i \times \bar{c}_i',\ u \rangle = r_i^2 \theta', \]
and thus

\[ \langle c_i \times c_i'',\ u \rangle = \langle c_i \times c_i',\ u \rangle' = r_i^2 \theta''. \]

So equation ($\tau_{\text{rigid}}$) gives the $u$ component of $\tau$ as

\[ \langle \tau, u \rangle = I_A \cdot \theta''. \tag{$\tau_{\text{axis}}$} \]


192 Chapter 5 


Equation ($\tau_{\text{axis}}$) holds even in the more general case where the motion of a
rigid body is the result of combining rotation about an axis $A$ with a motion of
this axis parallel to itself. In fact, this just changes the $c_i$ to
\[ c_i + \alpha u + v, \]

for some functions $\alpha$ and $v$, with $v$ always perpendicular to $u$. It is then easy
to check that

\[ \langle (c_i + \alpha u + v) \times (c_i'' + \alpha'' u + v''),\ u \rangle = \langle c_i \times c_i'',\ u \rangle, \]

because each of the other terms in the expansion is 0.

If a more “physically intuitive” argument is preferred, we can use the same
sort of reasoning that we used in analyzing a rocket on page 33: At any particular time $t_0$ we work in the inertial system that is moving with the same velocity
as the axis at time $t_0$ to derive our equation, which involves only the second
derivative $\theta''(t_0)$ of $\theta$ at time $t_0$, and consequently also holds in our inertial
system.

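If there are no external forces on our rotating body, then we have $\theta'' = 0$, so
that the body rotates with constant angular velocity $\theta' = a$. Since we now have

\[ c_i' = a u \times c_i, \]

we get
\[
\begin{aligned}
c_i'' &= a u \times c_i' \\
&= a^2 u \times (u \times c_i) \\
&= a^2 [\langle u, c_i \rangle u - c_i],
\end{aligned}
\]
so that

\[ c_i \times c_i'' = -a^2 \langle u, c_i \rangle (u \times c_i). \]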


Rigid Bodies 193 


Since we are assuming there are no external forces, equation ($\tau_{\text{rigid}}$) then gives
\[ 0 = \sum_i m_i \langle c_i, u \rangle (u \times c_i). \]

Since

\[
\begin{aligned}
\mathbf{I}(u) &= \sum_i m_i c_i \times (u \times c_i) \\
&= \sum_i m_i |c_i|^2 u - \sum_i m_i \langle c_i, u \rangle c_i,
\end{aligned}
\]

this shows that
\[ u \times \mathbf{I}(u) = 0, \]

so $\mathbf{I}(u)$ is a multiple of $u$, and $u$ must be an eigenvector of $\mathbf{I}$. In the general
case, a rigid body has just three axes around which it can rotate without external
forces, and then the angular velocity must be constant.

The block shown below can rotate about the three axes of symmetry. You 


might naively expect that if it were provided with the right initial push it could 
also rotate about any other axis that passes through the center of mass of the 
block, as illustrated in the right hand part of the figure, but a little thought 
should be able to convince you otherwise (note that the angular momentum 
vector won't be constant). 


Kinetic energy. Although the inertia tensor arose naturally in our investigation
of the equations for rigid body motion, physics books usually introduce it in a
completely different way, involving an expression for the total kinetic energy $T$
of a rigid body. Using the equations for $v_i$ on page 186, with $w'$ the velocity of
the center of mass, we have

\[ v_i = \omega \times c_i + w', \]

so, letting $M = \sum_i m_i$ be the total mass of the rigid body, we obtain

\[ T = \tfrac{1}{2} \sum_i m_i |v_i|^2 = \tfrac{1}{2} M |w'|^2 + \tfrac{1}{2} \sum_i m_i |\omega \times c_i|^2 + \sum_i m_i \langle w',\ \omega \times c_i \rangle. \]
; i 


194 Chapter 5 


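The third term can be written as
\[ \Big\langle w',\ \omega \times \sum_i m_i c_i \Big\rangle = \langle w',\ \omega \times M \cdot C \rangle, \]

where $C$ is the center of mass. So if we choose a (usually non-inertial) coordinate
system with the center of mass as the origin, this term vanishes, and we obtain

\[
\begin{aligned}
T &= \tfrac{1}{2} M |w'|^2 + \tfrac{1}{2} \sum_i m_i |\omega \times c_i|^2 \\
&= \tfrac{1}{2} M |w'|^2 + \tfrac{1}{2} \langle \mathbf{I}_c(\omega), \omega \rangle,
\end{aligned}
\]

breaking up the total kinetic energy into a “translational” part and a “rotational”
part:
\[ T = T_{\text{transl}} + T_{\text{rot}}. \]

Since $w'$ is simply the velocity of the center of mass in our original inertial
system, $T_{\text{transl}}$ is just the usual kinetic energy of the center of mass, while the
term $T_{\text{rot}}$ is the extra kinetic energy due to rotation about the center of mass,
or simply the kinetic energy when the center of mass is fixed. So

\[ 2 T_{\text{rot}} = \langle \mathbf{I}(\omega), \omega \rangle = \langle (\omega_1, \omega_2, \omega_3) \cdot \mathcal{I},\ (\omega_1, \omega_2, \omega_3) \rangle. \tag{T} \]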
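
The splitting of the kinetic energy can be checked numerically for point masses. In the sketch below the masses, positions, angular velocity `omega` and center-of-mass velocity `w_prime` are all made-up illustrative values; the positions are first shifted so that the center of mass is the origin, as the derivation requires.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Usual inner product on R^3."""
    return sum(x * y for x, y in zip(u, v))


# Assumed illustrative data.
masses = [1.0, 2.0, 3.0, 4.0]
raw_points = [(1.0, 2.0, 0.0), (0.0, 1.0, 3.0), (2.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
omega = (0.5, -1.0, 2.0)     # angular velocity
w_prime = (3.0, 1.0, -2.0)   # velocity of the center of mass

M = sum(masses)
C = tuple(sum(m * p[k] for m, p in zip(masses, raw_points)) / M for k in range(3))
# Positions relative to the center of mass, so that sum_i m_i c_i = 0.
points = [tuple(p[k] - C[k] for k in range(3)) for p in raw_points]

# Direct kinetic energy with v_i = omega x c_i + w'.
T_direct = 0.0
for m, c in zip(masses, points):
    rot = cross(omega, c)
    v = tuple(rot[k] + w_prime[k] for k in range(3))
    T_direct += 0.5 * m * dot(v, v)

# Inertia tensor applied to omega: sum_i m_i c_i x (omega x c_i).
I_omega = [0.0, 0.0, 0.0]
for m, c in zip(masses, points):
    term = cross(c, cross(omega, c))
    for k in range(3):
        I_omega[k] += m * term[k]

T_transl = 0.5 * M * dot(w_prime, w_prime)
T_rot = 0.5 * dot(I_omega, omega)
print(T_direct, T_transl + T_rot)
```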


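Once we’ve introduced kinetic energy, it is naturally tempting to consider
what happens when our rigid body is moving in a conservative force field. Recall
that for a particle $c$ moving in a force field $F$ with

\[ F = -\Big( \frac{\partial V}{\partial x^1},\ \frac{\partial V}{\partial x^2},\ \frac{\partial V}{\partial x^3} \Big) \]

we have
\[
\begin{aligned}
\tfrac{1}{2} m |v(t_1)|^2 - \tfrac{1}{2} m |v(t_0)|^2 &= \int_{t_0}^{t_1} \langle v(t),\ F(c(t)) \rangle\, dt \\
&= -V(c(t_1)) + V(c(t_0)).
\end{aligned}
\]

So if $\langle\ ,\ \rangle$ denotes, as before, the usual inner product on $(\mathbb{R}^3)^K$, and
$c = (c_1, \dots, c_K)$, then adding the above equations for the various $c_i$ gives, for the
total kinetic energy $T$, the expression

\[
\begin{aligned}
T(t_1) - T(t_0) &= \int_{t_0}^{t_1} \langle v(t),\ F(c(t)) \rangle\, dt \\
&= -\sum_i V(c_i(t_1)) + \sum_i V(c_i(t_0)).
\end{aligned}
\tag{C}
\]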


Rigid Bodies 195 


This equation holds for particles $c_1, \dots, c_K$ moving independently, but what
we are interested in is the motion of $c$ as a rigid body, which we think of as the
motions of the $c_i$ under the forces

\[ \tilde{F}_i = F_i + \sum_j F_{ij} \]
for suitable “internal” forces $F_{ij}$. Then we have
\[ T(t_1) - T(t_0) = \int_{t_0}^{t_1} \langle v(t),\ \tilde{F}(c(t)) \rangle\, dt, \]

where

\[ \langle v(t),\ \tilde{F}(c(t)) \rangle = \langle v(t),\ F(c(t)) \rangle + \sum_{i,j} \langle v_i(t),\ F_{ij} \rangle. \]
But the (easy direction of) the principle of virtual work says that

\[ \sum_{i,j} \langle v_i(t),\ F_{ij}(c(t)) \rangle = 0; \]

as the physicists like to say, the total work done by the internal forces of a rigid
body is always 0. Consequently, equation (C) holds even for the motion of $c$ as
a rigid body:

\[ T(t_1) - T(t_0) = -\sum_i V(c_i(t_1)) + \sum_i V(c_i(t_0)). \]


For a rigid body moving under the influence of gravity near the earth’s surface,
each $V(c_i(t))$ is just $m_i h_i(t)$ where $h_i(t)$ is the height of $c_i$ at time $t$, and it is
easy to see that if $h_C(t)$ is the height of the center of mass of the rigid body,
and $M$ is its total mass, then

\[ \sum_i V(c_i(t)) = \Big( \sum_i m_i \Big) h_C(t) = M \cdot h_C(t). \]

So we obtain, finally,

\[ T(t_1) - T(t_0) = M \cdot [h_C(t_0) - h_C(t_1)]. \]


Continuous bodies. Mathematically, it is straightforward to generalize the previous considerations to a continuous rigid body $B$ with density $\rho$: The total
mass $M$ is given by

\[ M = \int_B \rho = \int_B \rho(x, y, z)\, dx\, dy\, dz, \]


196 Chapter 5 


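using $x, y, z$ for the standard coordinate functions on $\mathbb{R}^3$, and the center of
mass is the vector given by

\[ C = \frac{1}{M} \int_B \rho \cdot (x, y, z), \]

i.e., $C$ is the point of $\mathbb{R}^3$ with coordinates $1/M$ times

\[ \int_B x \cdot \rho(x, y, z)\, dx\, dy\, dz, \qquad \int_B y \cdot \rho(x, y, z)\, dx\, dy\, dz, \qquad \int_B z \cdot \rho(x, y, z)\, dx\, dy\, dz. \]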
B B B 


However, there is one point that might be emphasized to avoid confusion. 
We often consider forces that are supposed to be acting on a single point of a 
rigid body. If we think in terms of a continuous body, we then have a finite force 


acting on a single point, which ought to have mass 0. It helps to think back to 
our view of a rigid body as a finite collection of particles, where our analysis 
shows that the total effect of all the internal forces should be that each force 
may be considered as acting on the center of mass. For continuous bodies this 
may be translated into the “principle” that a force exerted anywhere will have 
the same effect as one exerted at the center of mass. 

Continuing with our consideration of continuous bodies, we set the inertia
tensor $\mathbf{I}$ of $B$ to be the linear transformation whose matrix $\mathcal{I}$ with respect to
the standard basis $(e_1, e_2, e_3)$ is given by [here $\rho$ denotes $\rho(x, y, z)$]

\[
\begin{pmatrix}
\int_B \rho \cdot (y^2 + z^2)\, dx\, dy\, dz & -\int_B \rho \cdot xy\, dx\, dy\, dz & -\int_B \rho \cdot xz\, dx\, dy\, dz \\
-\int_B \rho \cdot yx\, dx\, dy\, dz & \int_B \rho \cdot (x^2 + z^2)\, dx\, dy\, dz & -\int_B \rho \cdot yz\, dx\, dy\, dz \\
-\int_B \rho \cdot zx\, dx\, dy\, dz & -\int_B \rho \cdot zy\, dx\, dy\, dz & \int_B \rho \cdot (x^2 + y^2)\, dx\, dy\, dz
\end{pmatrix}
\]

Once we have the inertia tensor, it is irrelevant whether we are considering a
continuous body or a discrete set of points: everything becomes a set of equations for the $\omega$. The rotational kinetic energy is given by equation (T), and
the diagonal terms of $\mathcal{I}$ are naturally what we call the moments of inertia for
continuous bodies; it is left as an easy exercise for the reader to check that the
parallel axis theorem still holds.
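
To see these integrals in action, here is a rough numerical sketch for one assumed example, a cube of side `side` and constant density centered at the origin, for which the moment of inertia about each coordinate axis is known to be $M \cdot \text{side}^2 / 6$. The integrals are approximated with a midpoint rule on an `n × n × n` grid, so the result only agrees with the closed form up to a discretization error of order $1/n^2$; the function name and the parameter values are illustrative choices.

```python
def cube_inertia_matrix(side, density, n):
    """Midpoint-rule approximation of the inertia matrix of a uniform cube
    of the given side length, centered at the origin."""
    h = side / n
    cell_mass = density * h ** 3
    centers = [-side / 2 + (k + 0.5) * h for k in range(n)]
    matrix = [[0.0] * 3 for _ in range(3)]
    for x in centers:
        for y in centers:
            for z in centers:
                p = (x, y, z)
                norm_sq = x * x + y * y + z * z
                for r in range(3):
                    for c in range(3):
                        diagonal = norm_sq if r == c else 0.0
                        matrix[r][c] += cell_mass * (diagonal - p[r] * p[c])
    return matrix


side, density, n = 2.0, 3.0, 20
I = cube_inertia_matrix(side, density, n)

M = density * side ** 3       # total mass of the cube
exact = M * side ** 2 / 6     # closed-form moment about a coordinate axis
print(I[0][0], I[1][1], I[2][2], exact)
```

By symmetry the products of inertia vanish, so the coordinate axes are principal axes and all three principal moments coincide.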


Rigid Bodies 197 


Elementary examples. Some aspects of rigid body motion can be analyzed 
without actually solving any equations, simply by using the angular momentum 
law. In Chapter 3 we mentioned the standard elementary illustration of the 
special case of conservation of angular momentum, involving some one seated 
on a rotating stool. An illustration of the more general angular momentum 
law is provided when the person on the stool holds the ends of an axle with a 
heavy wheel rapidly rotating on it, and tries to turn the axle, either clockwise or 
counter clockwise; the rather non-intuitive result is that the axle is fairly hard to





turn, but the effort causes the stool to start spinning. Unlike the first example, 
in which spin cannot be obtained starting from rest—though a non-zero initial 
spin can be modified—this second example truly involves a rigid body. ‘To be 
sure, we have both a rotating wheel and a stationary axle, but that is merely a 
convenience, and one could imagine a single object consisting of the wheel and 
the axle rigidly connected and rotating together (though this would be awfully 
hard on the hands of the person trying to hold it). 

Without examining the detailed motion of the spinning wheel, we can see
that it has a large angular momentum $L$ parallel to the axle, while the twisting
motion of the person on the stool adds a small angular momentum $L_1$ in the
horizontal plane. The resultant, dashed, arrow is still in the same horizontal
plane, but rotated from the axis; hence the axis of the rotating wheel needs to
rotate in order for the wheel to have this new angular momentum.

It’s also nice to know that we can give an “elementary” argument, by con- 
sidering the velocity vectors at various points of the wheel, together with the 


198 Chapter 5 


additional velocities added by the twisting motion—indicated by solid arrows 
with white heads—which are oppositely directed at the top and bottom of the 





wheel, and 0 at the sides. ‘The dashed arrows show the resultant velocities, and 
obviously the axis of the wheel must rotate in order for it to have these velocities. 

The figure below shows the same situation except that now the forces on the
axle are exerted in the horizontal plane, $L_1$ points downward, and the resultant
dashed arrow is in a vertical plane, pointing downward. If this is the view of
a bicycle rider (riding toward the plane of the picture), turning the wheel by
means of the handle-bars to which it is attached, the rider soon intuitively learns
to lean to the left as the bicycle is thus steered to the left.

[Figure: the angular momentum $L$ of the wheel, the added angular momentum $L_1$, and their dashed resultant.]

Admittedly, there’s something a little fishy about all these descriptions. How
can the person sitting on the stool suddenly acquire a horizontal rotation from a
force directed upwards? How can the completely horizontal force used to turn
a bicycle to the left end up producing a motion of the bicycle in the perpendicular plane (a question that many a first-time bicycler may well wonder about, though
perhaps not in those terms)? Obviously, the motion must be somewhat more
complicated than what we have described.

We defer this question to Chapter 9, which gives a more detailed investigation
of the equations for rigid body motion; though the situation is fairly complicated,
it involves the straightforward part of our subject, mathematics, and we have
not yet finished dealing with the tricky part, elementary physics.




ADDENDUM 5A 
THE STRONG FORM OF THE THIRD LAW 


In Chapter 1 we noted that the third law is often accepted rather uncritically, 
presumably under some mistaken application of the notion of symmetry. In 
the case of the strong form of the third law, which we’ve made the basis for 
our analysis of rigid bodies, the symmetry argument seems more relevant: if we 
assume that the laws of physics don’t distinguish any direction from any other, 
what other direction could the force between two bodies have except the line 
between them? Still, we might feel impelled to inquire whether the strong form
of the third law is consistent with experiment. And the answer to this question 
turns out to be No, although it’s a fairly complicated No. 


In Problem 1-24, we introduced the Lorentz force law 
\[ \mathbf{F} = q(\mathbf{v} \times \mathbf{B}) \]


for a moving particle with charge q and velocity v in a magnetic field B. If
our magnetic field is produced by a magnet, rather than a solenoid, we would 
seem to have a force between the moving charged particle and the magnet that 
doesn’t lie along the line between them. Of course, a magnet isn’t a particle 
(unless some one actually discovers a magnetic monopole!), but this example 
might still give us pause. 

We should begin by noting that, on the face of it, the Lorentz force law 
$\mathbf{F} = q(\mathbf{v} \times \mathbf{B})$ can't be true, since it would mean that F would turn out to be
different for an observer in another inertial system where the velocity v of the
particle is different! In fact, the proper statement of the law is


\[ \mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}), \]


where E is the “electric field”; the apparent disparity between the results for 
different inertial systems is accounted for by the fact that a moving magnetic field
produces an electric field (as in an electrical generator), and moving charged 
particles produce a magnetic field (as in an electromagnet). 

In particular, two moving charged particles each produce a magnetic field, 
which affects the other particle according to the Lorentz law, and this can have 
very strange consequences, as shown for the two charged particles in the figure 
below. The magnetic field produced by a moving charged particle is 0 along 


its line of motion, so particle $p_2$ is affected only by the charge on $p_1$. But $p_1$
not only has the equal but opposite force because of the charge of $p_2$, but also
has an additional force from the magnetic field produced by $p_2$. Thus the total
force on $p_1$ is not only not in the same direction as the force on $p_2$, it is actually
bigger! We won’t try to say much more about this here, because the complete 
analysis of this situation involves both the principles of electromagnetism and of 
relativity theory. We should mention, however, that the symmetry argument we 
were tempted to use on page 25 breaks down precisely because the directions 
of the velocities of the particles destroy the symmetry of the situation.
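The asymmetry just described can be checked numerically. The following is only a rough, nonrelativistic sketch: it assumes the textbook low-velocity field of a moving point charge, proportional to $q\,\mathbf{v} \times \mathbf{r}/|\mathbf{r}|^3$, and it uses units in which the Coulomb constant and $\mu_0/4\pi$ both equal 1. The function name `force_on` and the particular positions and velocities are illustrative choices, not anything taken from the text.

```python
import math

# Nonrelativistic sketch: Coulomb force plus the magnetic part of the Lorentz
# force between two moving point charges. Units are ASSUMED such that the
# Coulomb constant and mu_0/(4*pi) are both 1 (purely for illustration).

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def force_on(q_target, x_target, v_target, q_source, x_source, v_source):
    """Return (electric, magnetic) forces on the target due to the source."""
    r = sub(x_target, x_source)                          # from source to target
    d3 = norm(r) ** 3
    electric = scale(q_target * q_source / d3, r)        # along the joining line
    b_field = scale(q_source / d3, cross(v_source, r))   # field of moving source
    magnetic = scale(q_target, cross(v_target, b_field)) # q (v x B)
    return electric, magnetic

# p1 moves straight toward p2; p2 moves at right angles to the line joining them.
x1, v1 = (0.0, 0.0, 0.0), (0.3, 0.0, 0.0)
x2, v2 = (1.0, 0.0, 0.0), (0.0, 0.3, 0.0)

e1, m1 = force_on(1.0, x1, v1, 1.0, x2, v2)   # forces on p1
e2, m2 = force_on(1.0, x2, v2, 1.0, x1, v1)   # forces on p2
total1, total2 = add(e1, m1), add(e2, m2)
```

In this configuration `m2` vanishes, because $p_2$ lies on the line of motion of $p_1$, while `m1` is perpendicular to the line joining the particles, so `total1` and `total2` are neither opposite nor equal in magnitude.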

Of course, we hope that such effects, involving interactions between the elec- 
trons and protons that make up our rigid body, will average out to zero, and 
when we analyze rigid bodies in terms of point masses we are obviously thinking 
more in terms of molecules, which presumably ought to be good representatives 
of point masses without these added weird features. But that is only a vague
approximation, or possibly just a vague hope, not to mention that even at the 
level of molecules we really should be describing things in terms of quantum 
mechanics. So our treatment of rigid bodies certainly involves a great deal of 
idealization and simplification (gracefully finessing, along the way, the whole 
question of just how one distinguishes a solid from other forms of matter in 
terms of molecules). 


There is a school of thought that dispenses with all these problems, essentially 
retaining the original mental picture of matter as continuous, totally abjuring 
the strong form of the third law, and simply regarding equation (T_rigid) as a basic
fact about rigid bodies, verified by experiment (gracefully finessing, along the
way, the question of what constitutes a rigid body). The only slight problem
with this approach is that the answer to the question raised in our Prologue, 
just why the lever works as it does, then becomes simply “Because it does”. 


Take your pick. De gustibus non est disputandum.


A lively, not to say cantankerous, discussion of this question may be found in 
Truesdell [1], in the essay “V. Whence the Law of Moment of Momentum?”, 
which may be regarded as an equal and opposite force to the viewpoint adopted 
in this chapter; this essay also discusses how the general notion of conservation 
of angular momentum gradually evolved, as briefly alluded to on page 85, from 
Newton’s corollary to his Proposition about areas swept out under a central 
force. 




PROBLEMS 


1. Let $f\colon V \to \mathbb{R}$ be a linear function on a (possibly infinite-dimensional)
vector space $V$.

(a) If $v_1, v_2 \in V$, then $f(v_1)v_2 - f(v_2)v_1 \in \ker f$.

(b) We can write $V = \ker f \oplus W$, where $W$ is 1-dimensional.

(c) If $g\colon V \to \mathbb{R}$ is a linear function with $\ker f \subset \ker g$, then $g = \lambda f$ for some
$\lambda \in \mathbb{R}$.

(d) More generally, if $g, f_1, \dots, f_k\colon V \to \mathbb{R}$ and $\bigcap_i \ker f_i \subset \ker g$, then
$g = \sum_i \lambda_i f_i$ for some $\lambda_i \in \mathbb{R}$.


2. Let the Jacobian of $f\colon \mathbb{R}^n \to \mathbb{R}^k$ have rank $k$ on $f^{-1}(0)$, so that
$M = f^{-1}(0)$ is a submanifold of $\mathbb{R}^n$ of dimension $n - k$. Let $g\colon M \to \mathbb{R}$ be differentiable,
and suppose that $g$ has a maximum at $p \in M$.


(a) $M_p = \bigcap_{i=1}^k \ker df^i$, where $df^i\colon \mathbb{R}^n_p \to \mathbb{R}$.
(b) If $X_p \in M_p$, then $dg(X_p) = 0$. Hint: $X_p = c'(0)$ for some curve $c$ in $M$.
(c) Use Problem 1 to conclude that there are $\lambda_1, \dots, \lambda_k$ with


\[ dg(p) = \sum_{i=1}^k \lambda_i \, df^i(p). \]


3. Consider a rigid body of uniform density, and a finite number of forces at 
various points of the body. Older books in mechanics consider when two such 


sets of forces are “equivalent” in the sense that they have the same effect on the 
rigid body. This amounts to saying that their sums must be equal, and their
total torques about a given point must be equal. 


(a) A couple is a pair of forces P and P’ = —P at two different points. Show 
that any set of forces is equivalent to a set consisting of a single force F, together 
with one couple (P, —P). 




(b) Suppose we are considering only a 2-dimensional situation, so that all our 
forces lie in a plane. Show that the torque of a couple depends only on the 
distance d between the lines of the forces P and P’, and that we can conse- 
quently choose P to be collinear with F. Conclude that any set of forces in the 
2-dimensional situation is equivalent either to a single force, or to a couple. 

(c) In the 3-dimensional situation, consider the system of forces consisting of F 
at the point $b_1$, and the couple consisting of the force P at the point $b_2$, and
$P' = -P$ at the point $b_3$, and let $\tau$ be its torque around O. Let $\tau'$ be the
torque around O of the new system consisting of the same forces, but at the
points $b_i + b_0$. Show that


\[ \tau' = \tau - (b_0 \times F). \]


(Recall Corollary 4 of Chapter 3.) 

A wrench consists of a force F together with a couple (P, —P) with P a multiple 
of F. If $F \neq 0$, we can try to reduce our system to a wrench by choosing $b_0$
so that

\[ \tau - b_0 \times F = \lambda F \]


for some $\lambda$. Take the inner product with F to find a formula for $\lambda$, and then
conclude that any set of forces is equivalent to either a single force, a couple, or 
a wrench. 


4. (a) Consider two particles of masses $m_1$ and $m_2$ attached to a lever of negligible
weight, with their centers of mass at distances $d_1$ and $d_2$ from the fulcrum,
and let x(t) be the distance that the first particle falls after time t. Use
d’Alembert’s Principle to show that 
my 
" dym, — dym2 ay 
x" = g-——————_.. 1 
dim, +do2m> m_. “2 


(b) Suppose we start with the lever making an angle of $\theta$ with the horizontal,
and let T be the time it takes for the lever to reach a horizontal position, so that
the first particle has fallen the distance $h = d_1 \sin\theta$. We are going to consider
$d_2$ to be fixed, while $d_1$ can be varied, by moving the first particle along the
lever. Letting $D = d_1/d_2$, show that T satisfies

Dm D? D 

T* = constant: D- ee es constant - eee 

Dm, —mp> Dm, —m2 
and conclude that the lever will reach a horizontal position in the smallest 
amount of time when we choose 


1+ /5 
ip ee 





My 2 




5. (a) Prove the “perpendicular axis theorem”: For a 2-dimensional figure in 
the (x, y)-plane, with planar density $\rho$ (a plane “lamina”), the moment of inertia
$I_z$ around the z-axis is the sum $I_x + I_y$ of the moments of inertia around the x-
and y-axes. (Naturally this applies to any set of orthogonal axes two of which 
lie in the plane of the lamina.) 

(b) For a 3-dimensional object B with density $\rho$, we have


\[ I_x + I_y + I_z = 2 \int_B \rho(x, y, z)\, r(x, y, z)^2 \, dx\, dy\, dz. \]





6. Check or find the moments of inertia about all three axes for the objects
shown below, each centered around the origin; all bodies are supposed to be 
homogeneous, with mass M. The first row contains 2-dimensional objects, 
which are considered to have planar densities, as in Problem 5, while the rest are 
3-dimensional. Note that the inertia ellipsoid of an ellipsoid is a quite different 
ellipsoid. 


[Figure: the homogeneous objects for Problem 6, labeled with their moments of inertia.]


7. Consider a homogeneous disc, regarded as a 2-dimensional object with planar
density, as in the previous problem, which at time t has rotated about the
axis through its center by the angle $\theta(t)$. Show that its rotational kinetic energy
$T_{\mathrm{rot}} = \tfrac{1}{2} I (\theta')^2$, where I is its moment of inertia about that axis.


8. (a) If a rigid body is symmetric with respect to a plane P, then the direction
perpendicular to P is a principal direction, with the other two directions lying 
in P. (Here the hypothesis means that the reflection R through P takes any 
point of the rigid body to another point of the rigid body with the same density 
at the two points.) 

(b) Any homogeneous body in the form of a platonic solid or an Archimedean 
solid has at least three principal directions, with the same principal moment of 
inertia, and hence all axes through the center of mass are principal directions. 

(c) If a rigid body has rotational symmetry around a line L, then L is a principal
direction, with the other two directions lying in the plane perpendicular to L. 


9. The derivative of the rotational kinetic energy satisfies $T_{\mathrm{rot}}' = \langle \omega, \tau \rangle$.


10. (a) For any rigid body, the sum of any two of the principal moments of 
inertia is greater than or equal to the third.

(b) Given $\alpha, \beta, \gamma > 0$ with the sum of any two greater than or equal to the third,
there is a rigid body having $\alpha$, $\beta$, and $\gamma$ as its principal moments of inertia.


11. Let $B_2$ be a ball completely contained within the ball $B_1$, and let B be the
difference, $B = B_1 - (\text{interior } B_2)$. Find the principal moments of inertia of B.





12. Consider a rigid body b in a uniform gravitational field, so that the force on 
particle $b_i$ of mass $m_i$ is $m_i u$ for some unit vector u.


(a) As the rigid body b falls (not necessarily in the direction u, since it may have 
an initial velocity and rotation), the center of mass C moves the same as if it 
were a single particle moving under this force. Thus, the center of mass is the
“center of gravity”. 

(b) Assuming that C is actually one of the points in b, show that b is in equilibrium
under the force of gravity together with a force $-Mu$ on C.

(c) If a gravitational field is not uniform, then there are rigid bodies that do not 
have a “center of gravity” in the sense of part (b).


CHAPTER 6 
CONSTRAINTS 


The analysis of rigid bodies represents only the first step in dealing with

elementary physics problems, because those problems seldom involve an 
isolated rigid body in space. We usually have to consider rigid bodies resting 
on a table, or the floor, or hanging from the ceiling; or we have rigid bodies
interacting with each other, colliding or sliding along one another; or our rigid 
body is restricted by certain “constraints”, which, as we will see, have played a 
role even in the few simple systems that we have already encountered. 


Rigid bodies in contact. In some situations we just have to use “common sense”, 
and we might as well get such considerations out of the way in this section, which 
will consist of various observations, rather than any systematic presentation. 
The simple notion of one body in contact with another has no meaning in 
terms of our theoretical picture: ideal rigid bodies cannot be in contact, but 




only very close to each other (just like real ones). If we have a rigid body B 
of mass M resting on a table, which we regard as yet another rigid body, but 
with essentially infinite mass, because the rigid table is itself resting on the earth,





then we might as well resort to the usual elementary analysis: The table must
exert an upward force on B equal in magnitude to the downward force gM that 
gravity exerts on B, since B isn’t moving. And that of course means that B must 
be exerting the force gM on the table; it’s easy to perceive this force directly by 
sticking one’s finger between B and the table. 




We might ask, instead, what upward forces are exerted on the table top by the 
table legs to balance the downward force of B. For a 2-dimensional situation, 
thinking of the table top as a long plank supported by two legs, as in (a), the 






forces are easily determined from the conditions that the total force and total 
torque on the table top are equal to 0. But if there are three legs, as in (b), there 
will be more than one possible solution; for example, we could simply ignore 
the middle leg. Such “statically indeterminate” problems require some considerations
of the way that “solid” bodies actually bend. The main ideas involved
in such investigations are presented in Addendum B. Although such interesting 
problems might seem like obvious topics for mechanics courses, nowadays they 
seem to have been relegated to courses in “applied mechanics” or “mechanical 
engineering”. 
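To make the contrast between cases (a) and (b) concrete, here is a small sketch in Python. It assumes an idealized weightless plank carrying a single load, with positions measured along the plank; the function names and the sample numbers are hypothetical, chosen only for illustration.

```python
def two_leg_forces(x_left, x_right, x_load, weight):
    """Upward leg forces on a weightless plank carrying one load.

    Solves  n_left + n_right = weight  (total force is 0) together with
    torque balance about the left leg (total torque is 0).
    """
    span = x_right - x_left
    n_right = weight * (x_load - x_left) / span
    n_left = weight - n_right
    return n_left, n_right


def three_leg_forces(x_left, x_mid, x_right, x_load, weight, n_mid):
    """With three legs the same two equations leave one free parameter.

    Here the middle-leg force n_mid is simply chosen by the caller; every
    choice gives a solution of the force and torque equations.
    """
    span = x_right - x_left
    n_right = (weight * (x_load - x_left) - n_mid * (x_mid - x_left)) / span
    n_left = weight - n_mid - n_right
    return n_left, n_mid, n_right


two_legs = two_leg_forces(0.0, 4.0, 1.0, 8.0)
ignore_middle = three_leg_forces(0.0, 2.0, 4.0, 1.0, 8.0, n_mid=0.0)
share_middle = three_leg_forces(0.0, 2.0, 4.0, 1.0, 8.0, n_mid=2.0)
```

Both `ignore_middle` and `share_middle` balance the plank, which is exactly the indeterminacy described above: rigid-body statics alone cannot choose between them.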

Similarly, the analysis of the previous chapter tells us nothing directly about 
the collision of two rigid bodies (a); the only reason that realistic, nearly rigid, 




bodies rebound is because they do compress a bit (b). It would seem reason- 
able, however, to regard our theoretical rigid body as “perfectly elastic” in the 
sense discussed in Chapter 3—the forces between the compressed particles simply
restore them to their initial positions, increasing the velocity by exactly the
same amount that it has been diminished. In that case, we can simply use con- 
servation of kinetic energy to predict the result. Addendum A considers more 
complex extensions of such reasoning. 
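For the simplest case, two bodies moving along a line, the prediction can be written out explicitly. The sketch below assumes a head-on, one-dimensional collision and combines conservation of kinetic energy with conservation of momentum; the closed-form expressions are the standard ones for that case, and the function name is a hypothetical choice.

```python
def elastic_collision(m1, u1, m2, u2):
    """Final velocities (v1, v2) after a perfectly elastic 1-D collision.

    Obtained by solving
        m1*u1 + m2*u2 = m1*v1 + m2*v2                  (momentum)
        m1*u1**2 + m2*u2**2 = m1*v1**2 + m2*v2**2      (kinetic energy)
    and discarding the trivial solution v1 = u1, v2 = u2.
    """
    total = m1 + m2
    v1 = ((m1 - m2) * u1 + 2 * m2 * u2) / total
    v2 = ((m2 - m1) * u2 + 2 * m1 * u1) / total
    return v1, v2


# A body of mass 2 moving at speed 3 strikes a body of mass 1 at rest.
v1, v2 = elastic_collision(2.0, 3.0, 1.0, 0.0)
```

Note that the relative velocity simply changes sign, `v2 - v1 == -(u2 - u1)`, which is another way of saying that the velocity lost during compression is exactly restored.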

For a final example, involving yet another interesting complication, we con- 
sider three bodies in contact: a piece of iron, a long piece of wood, and a mag- 
net. The magnet exerts a force F on the iron, and the iron exerts the force —F 
on the magnet. We generally presume, probably without even formulating the 
thought, that the force F on the iron is “passed through” to the wood, i.e., that 
the iron must be exerting a force F on the wood, just as in the case of a rigid 
body resting on a table. 







To analyze this situation, the piece of wood can be thought of as a rigid body
consisting of two particles, $w_1$ and $w_2$. The piece of iron can, for the purposes
of this problem, simply be considered as a rigid body with only a single particle, 
and the same is true of the magnet, since we only care about the fact that 
the magnet exerts a force on the iron (we’re not going to start worrying about 
whether magnetic monopoles exist or not!). 


Now let’s consider the various forces involved. In addition to the forces F 
and —F that the magnet and the iron exert on each other, the iron exerts some 
force G on the particle $w_1$, and thus $w_1$ must exert the force $-G$ on the iron.




Similarly, w2 and the magnet must be exerting equal but opposite forces on 
each other. It seems natural to assume that these forces are —G and G (after 





all, why should the magnet interact with the wood any differently than the iron
does?). This means, according to our criterion, that the rigid wooden rod is
in equilibrium. Of course, that requires that the iron and magnet also have 
velocity 0, since they can’t pass through the wood! Since the total force on 
the iron is F — G, this means that we have F = G, as we would indeed quite 
unconsciously presume. 

Notice that there seems to be no purely logical way to rule out the possibility 
that the forces between $w_2$ and the magnet have values different from G and
—G. If that were the case, however, then our whole apparatus would be con- 
tinually accelerating—easily solving the energy crisis, among other things—a 
phenomenon that we don’t actually encounter in the real world. (These remarks
may be compared with those on page 277 of Chapter 7.) 




The pendulum. To begin our more systematic investigation of constraints, we
consider the pendulum, already mentioned several times in previous chapters, 
where a string anchored to a pivot point P supposedly constrains a bob—which 
we will simply regard as a point mass—to move along a circular arc. In reality, a
pendulum bob can’t actually move in a perfectly circular arc (even if we could 





have a pivot point P that was really totally immobile). When we initially release 
the pendulum bob, it starts to fall straight down, and the pendulum string only 
exerts sufficient force to counteract this force of gravity after it’s been stretched 
somewhat. Then that new force will bring the bob a bit above the circular arc, 
so that it starts to fall again, etc., etc., etc. 

In other words, the seemingly simple example of a pendulum is really an 
abstraction, presenting the same sorts of problem as a rigid body. In one respect 
it is simpler, however. We can regard the string as a collection of particles lying 
linearly along the string. The bob might be affected by a force from one or more
of these particles, but any such force acts on the bob in the direction of the
string. Thus, besides the force of gravity on
the bob, the only other force is a “constraint force” always directed along the 
line from the bob to the pivot P. 

This justifies our previous treatments of the pendulum, but it is also interesting 
to analyze the pendulum in a manner analogous to our treatment of rigid bodies; 
to simplify matters we restrict ourselves to a 2-dimensional picture right from 
the start. 


Step 1. Analogous to the condition that a set of points is in rigid equilibrium
under various forces, we may say that our pendulum bob, the single particle a, 
is in equilibrium under the force F, together with the constraint force of the 
string, keeping it at a fixed distance from the pivot, if 





\[ F + F_1 = 0 \]

for some (“internal”) force $F_1$ in the direction of a to P.




Step 2. We define a “configuration space” M for our problem. We now have 
a single particle a, rather than a set of particles $a_1, \dots, a_K$, and instead of the
constraints that the $|a_i - a_j|$ remain constant, we have the constraint that a
remain at a fixed distance $l$ from the pivot. So $M \subset \mathbb{R}^2$ is simply the circle
of radius $l$ about the pivot point, and the “virtual infinitesimal displacements”,
the tangent vectors to possible motions of a under the constraints, lie along the 
tangent line of the circle at a. 


Step 3. According to Step 1, a force F causes a to be in equilibrium under the 
constraints if and only if it points from a to P. This is the same as saying that 
it is perpendicular to the tangent line, 


\[ \langle F, v \rangle = 0 \qquad v \in M_a \]
(the principle of virtual work).

Step 4. We conclude that the path c of a should satisfy

\[ (*) \qquad \langle F(c(t)) - m c''(t), v \rangle = 0 \qquad v \in M_{c(t)} \]
(d’Alembert’s principle).


Step 5. To obtain a differential equation, we want to take a coordinate system 
on M. The natural choice is the angle $\theta$ that the line through a point of M
and the pivot P makes with the vertical position. With the standard abuse of
notation, we will let $\theta(t)$ denote $\theta(c(t))$, the $\theta$ coordinate of our particle at
time t. In (*), we take $v = \partial/\partial\theta$, a tangent vector of length $l$. If $l$ is the length of the string, then


\[ \langle c''(t), v \rangle = l^2\, \theta''(t). \]


If we choose F to be the constant downward force of gravity, with magnitude mg,
then
\[ \langle F(c(t)), v \rangle = -mgl \sin\theta(t). \]


So (*) gives us the pendulum equation


\[ (P) \qquad \theta'' + \frac{g}{l} \sin\theta = 0. \]




This equation doesn’t involve the internal force $F_1$ (tension), which is directed
along the radius and keeps the bob on the circular arc. Indeed, the whole point 
of this particular analysis was to eliminate $F_1$ from consideration—it is simply
whatever is necessary to keep the particle on the configuration space M. If 
we need to know this force, we note from the figure that the component of 





the gravitational force along the direction of the string has magnitude $mg\cos\theta$.
Avoiding a trap for the unwary, we note that the tension is not simply the
negative of this component, because the radial acceleration of the pendulum
bob is not zero, but $l\theta'^2$ (Problem 1-5), so $F_1$ has magnitude $mg\cos\theta + ml\theta'^2$
(we will be able to handle this question differently in Problem 12-3).
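The pendulum equation $\theta'' + (g/l)\sin\theta = 0$ and the tension formula $mg\cos\theta + ml\theta'^2$ can both be explored numerically. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta integrator and the illustrative values g = 9.81, l = 1, m = 1; the function name `simulate_pendulum` and all defaults are hypothetical.

```python
import math

def simulate_pendulum(theta0, omega0, g=9.81, l=1.0, m=1.0, dt=1e-3, steps=5000):
    """Integrate theta'' = -(g/l) sin(theta) with classical RK4.

    Returns a list of (theta, omega, tension, energy) samples, where
    tension = m g cos(theta) + m l omega^2, as in the text, and
    energy  = kinetic + potential (potential measured from the pivot height).
    """
    def accel(theta):
        return -(g / l) * math.sin(theta)

    theta, omega = theta0, omega0
    history = []
    for _ in range(steps):
        tension = m * g * math.cos(theta) + m * l * omega ** 2
        energy = 0.5 * m * (l * omega) ** 2 - m * g * l * math.cos(theta)
        history.append((theta, omega, tension, energy))
        # RK4 for the first-order system theta' = omega, omega' = accel(theta).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return history


history = simulate_pendulum(theta0=1.0, omega0=0.0)
tensions = [sample[2] for sample in history]
energies = [sample[3] for sample in history]
```

At the moment of release the tension is just $mg\cos\theta_0$, but it is largest as the bob passes through the bottom, where energy conservation gives $mg(3 - 2\cos\theta_0)$; this is the extra $ml\theta'^2$ term at work.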


This very special case illustrates a general method for approaching all such 
“constraint” problems. For rigid bodies, we showed that (F,v) = 0 for all tan- 
gent vectors v of the configuration space by differentiating the constraint equa- 
tions, and using the fact that the internal forces between particles in Newton’s 
third law lie along the direction between them. In the case of a pendulum, on
the other hand, we simply verified the condition (F,v) = 0 explicitly. The same 
situation will hold for all our constraint problems—we will simply assume that 
any internal forces are always perpendicular to the configuration space. In fact, 
a “constraint” more or less means a condition for which this holds (sometimes 
the term “ideal constraint” is used). Thus, as a general rule we have 


d’Alembert’s Principle for Constraints: If the constraints on a system
confine the system to a configuration space M, and are perpendicular 
to M, then the motions of the system under the external forces F satisfy 


\[ \langle F - mc'', v \rangle = 0 \quad \text{for all } v \text{ tangent to } M \text{ at } c. \]


In the case of rigid bodies, our formulation of d’Alembert’s principle was more 
precise: we showed, conversely, that if a motion of the rigid body satisfies this 
equation, then the appropriate internal forces exist. In the case of constraints, 
this will usually be more or less automatic, and, as in our pendulum example, 
the requisite constraint forces can usually be found explicitly, unlike the case of 
a rigid body, where the internal constraints aren’t unique. 




The compound (physical) pendulum. Since the only purpose of the pendulum 
string was to maintain the distance between the pivot and the bob, we might 
have simply considered these two particles as constituting a rigid body themselves.
More generally, we can consider a “compound pendulum”, or “physical
pendulum”, an arbitrarily shaped pendulum oscillating around an axle A; this 
is essentially a 2-dimensional problem, although we can think of the pendulum 
as a thin plate. 





This problem illustrates a bit more clearly than the simple pendulum problem 
the numerous abstractions that our analysis entails. In actuality, the pendulum is 
constrained to rotate about the axle by various forces all along the circumference 





of the circular hole around the axis. But that’s obviously a bit more complicated 
a situation than we want to consider, and for theoretical purposes it will be 
better to imagine the pendulum not as a continuous body, but as a collection 
of particles, one of which, $c_0$, is kept fixed. Thus we are assuming that the


constraint forces keep the point $c_0$ at distance 0 from some point P. Of course,
this theoretical picture is a little weird, since the constraint forces are supposed
to act along the line between $c_0$ and P, which doesn’t tell us anything! But our
theoretical picture is still capable of encompassing this situation, because of one 




important fact: Even though we can’t specify the direction of this constraint 
force C, we will have $\langle C, v_0 \rangle = 0$ for all virtual infinitesimal displacements
$v = (v_0, \dots, v_K)$ of our pendulum under this constraint, for the simple reason
that $v_0 = 0$, since the constraint keeps $c_0$ fixed.

This means that we can still use d’Alembert’s principle for constraints, where 
we now have the constraint that our collection of particles constitute a rigid 
body, together with the new constraint that the particle $c_0$ stays fixed. Our
configuration space is just a circle once again, although now we don’t think of it
as a circle of a particular radius, but simply as the collection of angles $\theta$ through
which our pendulum can rotate. Restricting ourselves to this configuration
space takes care of the constraint that $c_0$ stays fixed, and all the other constraints
have already been analyzed: thus, we just want to apply equation (T_axis) on
page 191. 

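The torque $\tau$ due to gravity is easily computed: Since the force on particle $c_i$
is $g m_i u$, where u is the unit downward vector field, we get


\[ \begin{aligned}
\tau &= \sum_i c_i \times g m_i u \\
     &= g \Bigl( \sum_i m_i c_i \Bigr) \times u \\
     &= gM \cdot C \times u,
\end{aligned} \]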


where C is the center of mass of the pendulum. This means that $\tau$ points in
the direction of the axis A, and has magnitude
\[ gMl \sin\theta, \]


where $\theta$ is the angle of C with the vertical, and $l$ its distance from the pivot.
So equation (T_axis) becomes


\[ \theta'' + \frac{gMl}{I_A} \sin\theta = 0. \]





Comparing to the pendulum formula (P), we see that our pendulum acts 
precisely like a single bob pendulum whose distance from the pivot is
\[ \frac{I_A}{Ml}. \]




If $I_C$ is the moment of inertia about the center of gravity C, then by the parallel
axis theorem we have


\[ \frac{I_A}{Ml} = \frac{I_C + Ml^2}{Ml} = l + \frac{I_C/M}{l}. \]
Introducing the radius of gyration k by
\[ I_C = Mk^2, \]


we find that our pendulum acts precisely like a single bob pendulum whose 
distance from the pivot is 


\[ l + \frac{k^2}{l}. \]


When our rigid body pendulum consists only of the pivot and a single particle 
at C, we have k = 0, but in any other case the pendulum will have a longer 
period. 
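As a quick check of the formula $l + k^2/l$, here is a short sketch applied to a uniform thin rod pivoted at one end. The rod values ($l = L/2$ and $k^2 = L^2/12$) and the small-oscillation period $2\pi\sqrt{\text{length}/g}$ are standard facts assumed for the example rather than results derived in this section; the function names are hypothetical.

```python
import math

def equivalent_length(l, k):
    """Length of the simple pendulum that moves like the compound one.

    l is the distance from the pivot to the center of mass, and k is the
    radius of gyration about the center of mass (I_C = M k^2).
    """
    return l + k ** 2 / l


def small_oscillation_period(length, g=9.81):
    """Period of small oscillations of a simple pendulum (standard formula)."""
    return 2 * math.pi * math.sqrt(length / g)


# Uniform thin rod of length L pivoted at one end: l = L/2, k^2 = L^2/12.
L = 1.2
rod_length = equivalent_length(L / 2, L / math.sqrt(12))   # equals 2L/3
point_length = equivalent_length(L / 2, 0.0)               # all mass at C

rod_period = small_oscillation_period(rod_length)
point_period = small_oscillation_period(point_length)
```

The rod behaves like a simple pendulum of length 2L/3, longer than the distance L/2 to its center of mass, so its period exceeds that of a point mass placed at C.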

Our analysis of the simple pendulum also shows that the force that our rigid 
body exerts on the pivot—that is, the sum of all the various forces that the 
particles in our rigid body exert on the pivot—is directed along the line from the 
center of mass to the pivot, and has magnitude $gM\cos\theta$. The decomposition of
this total force is not uniquely determined, just as the necessary internal forces 
in the pendulum are not uniquely determined, but in practice, of course, most 
of the force exerted on the pivot will come from particles of the rigid body that 
are near to it, even though mathematically we might think of it as a single force 
exerted on the pivot by the center of mass, though this might not even be a 
particle of the rigid pendulum! 


Equilibrium and stability. Before examining other sorts of constraints, we will 
use the physical pendulum as a simple example to discuss an important concept. 
For an ordinary pendulum (a), we can ask if there are any equilibrium points,
positions where the pendulum can simply remain motionless. Of course, the







obvious answer is that the equilibrium position is the one where the pendulum 
is simply hanging straight down (b) with velocity 0. Equilibrium positions are of 




interest because for a real pendulum, gradually slowing down because of friction 
at the pivot and air resistance, and for many other realistic mechanisms, the 
equilibrium positions are those where the mechanisms may eventually come 
to rest. Before settling down to these positions, they will often exhibit small
oscillations, and the equilibrium positions are the ones around which such small 
oscillations can occur. 

If we consider the case of a physical pendulum, so that the “string” is actually a 
thin rigid rod, whose mass we might consider to be negligible, with a bob being 
a particle of mass M, there is another equilibrium point, with the pendulum 


vertically above the pivot, with velocity 0. Naturally, this situation is clearly 
a little different—it’s easy to obtain the first equilibrium position and virtually 
impossible to achieve the second, but we’ll leave that matter hanging for now. 

At the moment, we simply want to recall that the force F on the pendulum 
bob, a downward force of magnitude gM, is given by 


\[ F = -\Bigl( \frac{\partial V}{\partial x}, \frac{\partial V}{\partial z} \Bigr) \]


for the potential function
\[ V(x, z) = gM \cdot z, \]


and note that the two equilibrium points occur at the maximum and minimum 
of V when restricted to our configuration space. 

In fact, this observation generalizes for any case of d’Alembert’s principle for
constraints, for a configuration space $M \subset \mathbb{R}^N$, with a force $F \in T\mathbb{R}^N$ having a
potential function $V\colon \mathbb{R}^N \to \mathbb{R}$. As noted on page 90, we have $\langle F, v \rangle = -v(V)$
for all tangent vectors v. At a critical point p of V on our configuration space,
where $v(V) = 0$ for all tangent vectors $v \in M_p$, we thus already have


\[ \langle F(p), v \rangle = 0 \quad \text{for all } v \in M_p, \]
so the constant curve $c(t) = p$ in M, with $c'(0) = c''(0) = 0$, definitely satisfies


\[ \langle F(p) - mc''(0), v \rangle = 0 \quad \text{for all } v \in M_p, \]




and is thus a possible motion. It is also clear, conversely, that if p is an equilib- 
rium point, then p must be a critical point of V. Several examples are given in 
the Problems. 

In our one-dimensional pendulum example, the minimum critical point is 
“stable”: if we start close enough to this position, with velocity close to 0, then 
the pendulum will stay close to the minimum equilibrium point. On the other 
hand, the maximum critical point is not stable; in fact, for any other initial 
position and velocity, no matter how close to this equilibrium position, the pen- 
dulum will eventually move far away. 
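The difference between the two equilibria is easy to see in a simulation. The sketch below integrates the pendulum equation from initial positions very close to each critical point and records how far the angle wanders. It assumes g = 9.81, l = 1, and a simple semi-implicit Euler step, which is crude but adequate for a qualitative picture; the function name `max_deviation` is hypothetical.

```python
import math

def max_deviation(theta_eq, offset, omega0=0.0, g=9.81, l=1.0, dt=1e-3, t_end=10.0):
    """Largest |theta - theta_eq| reached while integrating
    theta'' = -(g/l) sin(theta), starting at theta_eq + offset."""
    theta, omega = theta_eq + offset, omega0
    worst = abs(offset)
    for _ in range(int(t_end / dt)):
        # Semi-implicit (symplectic) Euler: update the velocity first, then
        # use the new velocity to update the angle.
        omega -= (g / l) * math.sin(theta) * dt
        theta += omega * dt
        worst = max(worst, abs(theta - theta_eq))
    return worst


near_bottom = max_deviation(0.0, 0.01)      # start close to the minimum of V
near_top = max_deviation(math.pi, 0.01)     # start close to the maximum of V
```

Starting 0.01 radians from the bottom, the pendulum never strays much further than that; starting 0.01 radians from the top, it swings almost all the way around.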

In general, decisions about stability are much harder in higher dimensions, 
and we will discuss certain aspects, the basic question of stability of solutions 
of differential equations near a 0 point of the corresponding vector field, in 
Addendum 8C. 


Sliding. The motion of a block sliding down another wedge shaped block is per- 
haps the simplest problem involving the motion of one rigid body with respect 
to another. This is often described as a block sliding down an inclined plane, 


and for the present we are in fact considering the wedge as being immovable, 
rather than an object that can itself slide horizontally along the floor. 

The usual elementary analysis of this problem has already been mentioned 
in Chapter 1 (page 30): we decompose the force F of gravity on the block
into a force $F_2$ parallel to the inclined plane, and a force $F_1$ perpendicular to
the inclined plane, and reason that $F_2$ doesn’t act on the inclined plane. In




Chapter 1 we simply said that $F_1$ is presumed to be the force of the block on
the inclined plane, so that the inclined plane must exert the force $-F_1$ on the
block. But we really need to use the hypothesis—now explicitly mentioned— 
that the inclined plane is stationary, using the same reasoning as for a block
resting on a table: $F_1$ determines the acceleration of the block in the direction




perpendicular to the inclined plane, but that must be 0 (since the wedge is not 
moving and the block slides along it), so the inclined plane must be exerting a 
force of $-F_1$ on the block.

On the other hand, as soon as one tries to think about this in terms of the 
physics of point masses, by imagining the “molecules” in the block and inclined 
plane, the whole argument appears dubious, since the forces between these point 


masses ought to be along the line between them, and thus seldom perpendicular 
to the inclined plane. 

Of course, we’ve neglected the important (implicit or explicit) qualification
that the block is sliding without friction. And what does that mean? Why, it must
mean precisely that when we look at the forces between the molecules of the
block and the inclined plane, the component of such forces in the direction of
the plane should be ignored. In other words, the inclined plane exerts a force
$-F_1$ on the block because that’s what we’re assuming.

Although it is reasonable to add this hypothesis in order to produce a sim- 
ple theoretical problem—and mechanics is replete with problems in which we 
ignore friction—it would be nice to have a mental picture that provides some 
correlation with our notions of friction. 

Friction, which we will discuss a bit more thoroughly in Chapter 11, is ac- 
tually an incredibly complicated phenomenon, because it is an expression of 
intermolecular forces. As pointed out in Feynman [1; pp. 12-3 to 12-5]:


The tables that list purported values of [friction] for “steel on steel,” 
“copper on copper,” and the like, are all false ... The friction is never 
due to “copper on copper,” etc., but to the impurities clinging to the 
copper. ... If we try to get absolutely pure copper, if we clean and 
polish the surfaces, outgas the materials in a vacuum, and take every 
conceivable precaution, ... 


then the copper block does not slide more easily along the inclined copper plane, 
in fact it does not slide at all—the two pieces of copper stick together, even if 
the inclined plane is completely vertical, because the copper atoms near the 
surfaces of the two pieces are attracted by the very same forces that keep the 
atoms within the individual pieces together. 




Although that involves considerations far outside those of elementary me- 
chanics, it might suggest that we think of our block as sliding along the inclined 
plane on protuberances, like furniture “gliders”, which reduce the contact be- 
tween two wooden objects. Indeed, if we imagine the theoretical case of a body 




with a curved surface resting tangentially on the inclined plane, it would seem 
natural to assume in this case that the forces exerted by the two bodies upon 
each other would be perpendicular to their common tangent plane. 

Once we’ve made the leap of faith to frictionless surfaces, we simply use the
analysis on page 30, finding that the force $F_2$ parallel to the inclined plane has
magnitude $mg\sin\alpha$, so if $c(t)$ is the distance that the block has traveled along
the inclined plane after time $t$, then $c'' = g\sin\alpha$, and our block slides with a
uniform acceleration that is $\sin\alpha$ times its free fall acceleration.

When we allow the inclined plane to be a wedge that slides along a horizontal
plane, also without friction, our problem becomes more involved. The force $F_1$
that the block exerts on the wedge can no longer be obtained simply by resolving
the downward force of gravity $F$ into forces perpendicular and parallel to
the wedge, because our identification of $F_1$ with the perpendicular component
depended on the wedge being fixed.

Choosing the unit vector $e_1 = (1,0)$ parallel to the floor and the unit vector $u$
parallel to the slope of the wedge, we let $Ae_1$ be the acceleration of the wedge,
of mass $M$, along the horizontal plane, while $au$ is the acceleration of the block,
of mass $m$, along the wedge, so that $au + Ae_1$ is the acceleration of the block in







our inertial system. Note that in our picture, we actually have $A < 0$, so that
the arrow $Ae_1$ points in the opposite direction, since the force $F_1$ causes the
wedge to slide to the left.

Breaking up the equation

$$-F_1 + F = m(au + Ae_1) \tag{1}$$

for the motion of the block into the components that are parallel and perpendicular
to the slope of the wedge gives

$$mg\sin\alpha = ma + mA\cos\alpha \tag{1a}$$

$$|F_1| - mg\cos\alpha = mA\sin\alpha. \tag{1b}$$


The force on the wedge, of mass $M$, is $F_1$ plus the gravitational force downward,
plus whatever upward force the horizontal plane must exert to keep the
wedge from moving downwards. So $A$ is determined by the horizontal component
of $F_1$:

$$-|F_1|\sin\alpha = MA. \tag{2}$$


From (1b) and (2) we get

$$A = -g\left(\frac{\sin\alpha\cos\alpha}{\sin^2\alpha + \dfrac{M}{m}}\right),$$

and then (1a) gives

$$a = g\sin\alpha - A\cos\alpha.$$


Just for fun, we will analyze this problem using configuration spaces. To do
this, we regard our system as consisting of two “particles”, the block $c_m$ and the
wedge $c_M$. Since the wedge $c_M$ always stays on the horizontal axis, we’ll simply




consider our problem as occurring in $\mathbb{R}^2 \times \mathbb{R}$, with $((a, b), x)$ representing the
particle $c_m$ at the point $(a,b)$, and the particle $c_M$ at $x$.

Our configuration space $M$ consists of all $((a,b),x)$ which represent the
block $c_m$ resting on the wedge $c_M$ [i.e., for which we have $b = (x - a)\tan\alpha$].




A convenient coordinate system on $M$ is provided by the coordinate $x \in \mathbb{R}$
giving the position of $c_M$, together with the distance $s$ of $c_m$ from the top of
the wedge.

To determine $\partial/\partial s$, we keep $x$ fixed and vary $s$, obtaining a curve in $M$ whose
$\mathbb{R}^2$ component moves down the slope of the wedge, while its $\mathbb{R}$ component is
fixed, so for the unit vector $u$ in $\mathbb{R}^2$ parallel to the slope of the wedge, we have

$$\frac{\partial}{\partial s} = (u, 0).$$


On the other hand, if we keep $s$ fixed and vary $x$, then we obtain a curve
in $M$ whose $\mathbb{R}^2$ component moves parallel to the first axis along with its $\mathbb{R}$
component, so for $e_1 = (1,0)$ we have

$$\frac{\partial}{\partial x} = (e_1, 1).$$
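Now if $s(t)$, $x(t)$ are the functions describing the motion of $c_m$, $c_M$, we have

$$\begin{aligned} c_m(t) &= s(t)u + x(t)e_1 \in \mathbb{R}^2 \\ c_M(t) &= x(t) \in \mathbb{R}, \end{aligned}$$

so

$$\begin{aligned} c_m'' &= s''u + x''e_1 \in \mathbb{R}^2 \\ c_M'' &= x'' \in \mathbb{R}. \end{aligned}$$

The external forces $F_m$ on $c_m$ and $F_M$ on $c_M$ are given by

$$\begin{aligned} F_m &= -mg\,e_2 \\ F_M &= 0, \end{aligned}$$

so our condition for a solution is that

$$\langle -mg\,e_2 - m\,c_m'',\ v_1\rangle + (0 - M c_M'')\cdot v_2 = 0$$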


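for all $v = (v_1, v_2)$ tangent to $M$, where $\langle\ ,\ \rangle$ is the usual inner product in the
first factor $\mathbb{R}^2$, while the inner product in the second factor $\mathbb{R}$ is just ordinary
multiplication. Choosing

$$v = \frac{\partial}{\partial s} = (u,0) \quad\text{and then}\quad v = \frac{\partial}{\partial x} = (e_1, 1)$$

gives the two equations

$$\begin{aligned} 0 &= \langle -mg\,e_2 - ms''u - mx''e_1,\ u\rangle - Mx''\cdot 0 \\ 0 &= \langle -mg\,e_2 - ms''u - mx''e_1,\ e_1\rangle - Mx''\cdot 1, \end{aligned}$$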




which amount to the equations

$$\begin{aligned} 0 &= mg\sin\alpha - ms'' - mx''\cos\alpha \\ 0 &= -ms''\cos\alpha - mx'' - Mx''. \end{aligned}$$


Solving for $x''$ and $s''$ gives us the same results that we obtained previously,
when they were called $A$ and $a$, respectively. The first of the above equations
is precisely (1a), while the second is a combination of (1a), (1b), and (2).

Although this configuration space method doesn’t explicitly use the force $F_1$,
it probably seems more complicated than the elementary method. In Part III we 
will see that Lagrangian mechanics provides a more convenient way of handling 
the problem, but the basic reasoning behind the use of Lagrangian mechanics 
for constraint problems is indicated by this straightforward use of configuration 
spaces, even though virtually all of the problems for this chapter are most easily 
solved by the elementary method. 
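
As a numerical sanity check on the claim that the two methods agree, one can solve the two configuration-space equations as a linear system in $s''$ and $x''$ and compare with the closed forms for $a$ and $A$ found by the elementary method. The following Python sketch is not part of the original text; the function names and the sample values of $m$, $M$, $\alpha$ and $g$ are chosen here purely for illustration.

```python
import math

def wedge_block_accelerations(m, M, alpha, g=9.81):
    """Solve the configuration-space equations for (s'', x''):

        0 = m g sin(alpha) - m s'' - m x'' cos(alpha)
        0 = -m s'' cos(alpha) - (m + M) x''
    """
    # Written as a 2x2 linear system  [[a11, a12], [a21, a22]] (s'', x'') = (b1, b2)
    a11, a12 = m, m * math.cos(alpha)
    a21, a22 = m * math.cos(alpha), m + M
    b1, b2 = m * g * math.sin(alpha), 0.0
    det = a11 * a22 - a12 * a21          # equals m (M + m sin^2 alpha) > 0
    s_dd = (b1 * a22 - a12 * b2) / det   # Cramer's rule
    x_dd = (a11 * b2 - a21 * b1) / det
    return s_dd, x_dd

def elementary_accelerations(m, M, alpha, g=9.81):
    """Closed forms from (1a), (1b), (2): returns (a, A)."""
    A = -g * math.sin(alpha) * math.cos(alpha) / (math.sin(alpha) ** 2 + M / m)
    a = g * math.sin(alpha) - A * math.cos(alpha)
    return a, A

if __name__ == "__main__":
    print(wedge_block_accelerations(1.0, 3.0, 0.5))
    print(elementary_accelerations(1.0, 3.0, 0.5))
```

The two printed pairs should agree up to rounding, and letting $M$ become very large should recover the fixed-wedge value $g\sin\alpha$ for $s''$.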


Rolling. I do not know whether the wheel first appeared as the invention of 
some Homo sapiens genius whose identity has been obscured by the mists of 
human history, or, in the manner of “2001, A Space Odyssey”, as a gift from an 
advanced intelligence. However, wheels have continued to bedevil the minds of 
people ever since, and even today one can encounter the bemusing “paradox” 
of two wheels rotating on a common axle, supposedly implying that the cir- 
cumference of the smaller circle is the same as that of the larger one (something 
akin to this paradox appears near the beginning of Galileo [2].) One can easily 





dispel any mystery this might present by using an apparatus that allows the two 
wheels to roll on separate tracks, rotating independently on the same axle, so 





that one sees the smaller wheel rotating more than once in the same time that 
the larger wheel rotates exactly once. If one then secures the inner wheel to the 




axle on which the outer wheel is attached, say by a screw, and there is sufficient 
friction on the two tracks to prevent the wheels from sliding, the wheels will 
simply not move, or do so in a very jerky manner, because they both can’t roll 
at once. 

The moral of all this seems to be that “rolling” must mean moving in such a 
way that each arc of the wheel traces out a line of the same length—if a wheel 
doesn’t have this property, then it just isn’t “rolling”. But this pronouncement 
is not very useful from the point of view of physics, because it doesn’t explain 
physically “what is going on” (which, in a way, is what bothered Galileo); more 
precisely, it doesn’t provide a way of analyzing a wheel rolling down an inclined 
plane in terms of various forces. 

Physics books that discuss this problem in any detail point out that rolling 
depends on a truly paradoxical fact: a wheel only rolls because of frictional 
forces, ones that “oppose sliding”: a wheel not only displays the characteristics 
of our abstract rigid body, but it also has the strange feature that it is affected by 
a frictional force that is exerted only at the (always changing) contact point of 
the wheel on the inclined plane. To make the picture even more confusing, this
frictional force essential for rolling won’t affect conservation of energy, because, 
the physics books note, the path followed by any point on the circumference of 
the wheel has velocity 0 at the moment it hits the plane! 

It is indeed well known that a cycloid, the path followed by a point on the
circumference of a wheel, has velocity 0 at the point of contact, but physics books





seem to regard this fact as intuitively clear, as well as another claim, that the 
motion at any time is essentially just a pure rotation about the contact point, 





a fact that is sometimes invoked in discussing the phenomenon of two wheels 
rotating on a common axle. For those of us not endowed with the requisite 
physical intuition, here is a proof, for the general case of one surface rolling on 
another. 




1. PROPOSITION. Consider two surfaces $M$ and $\bar M$ in $\mathbb{R}^3$ that are tangent at
a point $p$. Let $c$ be a curve in $M$, and $\bar c$ a curve in $\bar M$ such that $c(0) = p = \bar c(0)$,
and such that $c'(0) \neq 0$ is a multiple of $\bar c'(0)$. For each $t$ let $A(t)$ be the rigid
motion for which

(a) $A(t)(c(t)) = \bar c(t)$,
(b) $A(t)(M)$ is tangent to $\bar M$ at $\bar c(t)$,
(c) $A(t)(c'(t))$ points in the same direction as $\bar c'(t)$, so that

$$A(t)(c'(t)) = \alpha(t)\cdot \bar c'(t) \quad\text{for some function } \alpha.$$

Also, for each point $c(\tau)$ on the curve $c$, let $\gamma_\tau$ be the curve that this point
follows under these rigid motions,

$$\gamma_\tau(t) = A(t)(c(\tau)).$$

Then the following are equivalent:

(1) $\alpha(t) = 1$ for all $t$, so that the lengths of $c$ and $\bar c$ are the same on any
time interval $[t_0, t_1]$.

(2) Each $\gamma_\tau'(\tau) = 0$, so that $\gamma_\tau$ has velocity 0 at the time that it hits $\bar M$.

(3) For each $t$ we have

$$A'(t)(x) = C(x - \bar c(t))$$

for some skew-adjoint $C$ (so that $A(t)$ is, “up to first order”, a rotation
about the point $\bar c(t)$).


PROOF. Write $A(t)$ in the form

$$A(t)(x) = B(t)(x) + w(t), \qquad x \in \mathbb{R}^3,$$




for orthogonal $B(t)$. Setting $x = c(t)$ and using $A(t)(c(t)) = \bar c(t)$, we see that
$w(t) = \bar c(t) - B(t)(c(t))$, so we can write

$$A(t)(x) = B(t)(x) + [\bar c(t) - B(t)(c(t))].$$

The definition $\gamma_0(t) = A(t)(c(0)) = A(t)(p)$ gives

$$\gamma_0(t) = B(t)(p) + \bar c(t) - B(t)(c(t)), \tag{a}$$

so that we can also write

$$A(t)(x) = B(t)(x) + \gamma_0(t) - B(t)(p) = B(t)(x - p) + \gamma_0(t). \tag{b}$$

Since $B(0)$ is the identity, differentiating (a) gives

$$\begin{aligned} \gamma_0'(0) &= B'(0)(p) + \bar c'(0) - B'(0)(p) - c'(0) \\ &= \bar c'(0) - c'(0) \\ &= \bar c'(0) - \alpha(0)\cdot\bar c'(0). \end{aligned}$$

So $\alpha(0) = 1$ if and only if $\gamma_0'(0) = 0$, and by (b) this is true if and only if
$A'(0)(x) = B'(0)(x - p)$, where $B'(0) = B'(0)B(0)^*$ is skew-adjoint.

Our hypotheses on $A(t)$ then allow us to use this same argument at any point
$c(\tau)$ by considering the reparameterization $t \mapsto t + \tau$. ∎
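
For the simplest instance of the Proposition, a wheel of radius $R$ on a straight line, the equivalence of (1) and (2) can be checked numerically. The Python sketch below is an illustration added here, not part of the original text: the parameter `slip` is a hypothetical device that scales the distance moved by the center relative to the arc length turned, so `slip = 1` is genuine rolling, and the speed of the rim point at the moment of contact is estimated by a central difference.

```python
import math

def rim_point(theta, R=1.0, slip=1.0):
    """Position of the rim point that touches the line at theta = 0.

    The wheel has turned through the angle theta and its centre sits at
    (slip * R * theta, R); slip = 1 is rolling, any other value slides.
    """
    return (slip * R * theta - R * math.sin(theta), R - R * math.cos(theta))

def speed_at(theta, R=1.0, slip=1.0, h=1e-6):
    """Central-difference estimate of |d(rim_point)/d(theta)|."""
    x1, y1 = rim_point(theta + h, R, slip)
    x0, y0 = rim_point(theta - h, R, slip)
    return math.hypot(x1 - x0, y1 - y0) / (2 * h)

if __name__ == "__main__":
    print("rolling, at contact:", speed_at(0.0))
    print("sliding, at contact:", speed_at(0.0, slip=1.2))
    print("rolling, at the top:", speed_at(math.pi))
```

With `slip = 1` the path is the cycloid and the estimated speed at contact is zero up to rounding; with any other value of `slip` the contact point has a nonzero velocity along the line.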


For the case of a wheel rolling down an inclined plane, we can provide a 
picture that both reinforces this geometric information and also allows us to see 
“what is going on” physically. We regard our circular wheel as a polygon with 
a very large number of sides, and suppose that initially, in position 1, it is lying 
on the inclined plane along the segment AB. Now, instead of sliding down the 
plane, it rotates about the point B, reaching position 2 when vertex C hits the 







inclined plane, at C’. Then it rotates around C’, and so forth. (Galileo uses a 
picture of this sort for the case of two wheels rotating on a common axle, and his 
analysis makes for very interesting reading—it is the sort of inspired nonsense 
that only a genius would come up with, or at any rate, get away with.) 

Let’s return to the special case of a wheel rolling down an inclined plane, 
once again assumed to be immovable. Our “wheel” is really supposed to be a 
cylinder, so that it is forced to roll along a straight line, but the 2-dimensional 
cross-section picture provides all the interesting information. 

We consider a wheel of radius $R$ and mass $M$ and uniform density, and let
$\theta(t)$ be the angle through which it has rotated after time $t$. For a unit vector $u$
pointing down along the inclined plane, we will let $a\cdot u$ be the acceleration of
the center of mass, and let $-f\cdot u$ be the frictional force along the inclined plane
at the contact point of the wheel and the plane. The total force on the wheel is
the sum of the downward force of gravity, a constraining force $F_1$ perpendicular
to the plane, which keeps the wheel from moving perpendicularly to the plane,
and the frictional force $-f\cdot u$.

For the acceleration $a\cdot u$ of the center of mass we have

$$Ma = Mg\sin\alpha - f, \tag{1}$$

since $Mg\sin\alpha$ is the magnitude of the component of the gravitational force
along the inclined plane, while the constraining force $F_1$ is perpendicular to $u$.
Note that since our wheel is a rigid body, the force $f$ acts on the center of mass
(cf. remarks on page 196).

For the rotational motion about the center of mass we can apply equation
(Taxis) on page 191 to the axis through the center of mass that is perpendicular
to the plane of the drawing to get

$$I\theta'' = Rf, \tag{2}$$

where $I$ is the moment of inertia of the wheel about its center of mass.




Finally, the fact that our wheel is rolling tells us that the distance traveled by
the center of mass at time $t$ is equal to $R\cdot\theta(t)$, which means that

$$a = R\cdot\theta''. \tag{3}$$


Solving (1)–(3) gives

$$a = g\sin\alpha\,\frac{1}{1 + \dfrac{I}{R^2M}},$$

and substituting into (1) then gives

$$f = Mg\sin\alpha\,\frac{I}{R^2M + I}.$$

In terms of the radius of gyration $k$ defined (page 213) by

$$I = Mk^2,$$

we have

$$f = Mg\sin\alpha\,\frac{k^2}{k^2 + R^2}, \qquad a = g\sin\alpha\,\frac{1}{1 + (k/R)^2}.$$

Notice, by the way, that $f$ is always non-negative. If the wheel is rolling up
the inclined plane, because of some initial impulsive force at the bottom, the
frictional force opposing sliding is still directed upwards.

The expression $(Mg\sin\alpha\,k^2)/(k^2 + R^2)$ gives the amount of frictional force
that the inclined plane must be able to produce in order to prevent sliding. As
a very general rule, we can say that the frictional force that a body produces on
an inclined plane is proportional to the normal component of the gravitational
force, i.e., it equals $\mu\cdot Mg\cos\alpha$ for a constant $\mu$, the “coefficient of friction”.
So to prevent sliding at the angle $\alpha$ we need to have

$$\mu\cdot Mg\cos\alpha \ge Mg\sin\alpha\,\frac{k^2}{k^2 + R^2}$$

or

$$\mu \ge \tan\alpha\,\frac{k^2}{k^2 + R^2}.$$

Thus we will need a “perfectly rough” surface, with “$\mu = \infty$”, if we want to
prevent sliding at any angle.

If our wheel is simply a homogeneous disc of radius $R$ and mass $M$, then
(Problem 5-6) the moment of inertia $I$ is $\tfrac12 MR^2$, with $k^2 = \tfrac12 R^2$, so we have

$$f = \tfrac13\,Mg\sin\alpha, \qquad a = \tfrac23\,g\sin\alpha.$$




Thus our wheel rolls down the inclined plane with only 2/3 of the acceleration with
which a block slides down a frictionless inclined plane. But the difference is not due
to the “infinite friction” of the inclined plane on the wheel (the frictional force
does no work), but to the fact that the kinetic energy of the wheel has both
a translational part $T_{\text{transl}}$ and a rotational part $T_{\text{rot}}$ (page 194), and we find
(Problem 8) that the total kinetic energy at the bottom of the plane is precisely
$Mgh$, where $h$ is the initial height of the wheel.
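
The energy bookkeeping can be checked numerically from the formula for $a$ alone. The Python sketch below is an illustration added here (it is not the solution of Problem 8, and the function names and sample values are hypothetical): it starts the wheel from rest, uses $v^2 = 2ad$ over the length $d = h/\sin\alpha$ of the incline, and splits the kinetic energy into its translational and rotational parts.

```python
import math

def rolling_acceleration(alpha, k_over_R, g=9.81):
    """a = g sin(alpha) / (1 + (k/R)^2), with k the radius of gyration."""
    return g * math.sin(alpha) / (1.0 + k_over_R ** 2)

def energy_at_bottom(M, R, h, alpha, k_over_R, g=9.81):
    """Return (T_transl, T_rot) after rolling from rest down a height h."""
    a = rolling_acceleration(alpha, k_over_R, g)
    distance = h / math.sin(alpha)     # length of the incline
    v2 = 2.0 * a * distance            # v^2 = 2 a d for uniform acceleration
    I = M * (k_over_R * R) ** 2        # I = M k^2
    omega2 = v2 / R ** 2               # rolling condition v = R theta'
    return 0.5 * M * v2, 0.5 * I * omega2

if __name__ == "__main__":
    M, R, h, alpha = 2.0, 0.3, 1.5, 0.4
    T_transl, T_rot = energy_at_bottom(M, R, h, alpha, math.sqrt(0.5))
    print(T_transl + T_rot, M * 9.81 * h)
```

For the homogeneous disc, $k/R = 1/\sqrt{2}$, the two printed numbers should agree up to rounding, with the rotational part equal to half of the translational part.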


One aspect of our solution deserves particular notice. When $\alpha = 0$, so that
the wheel is simply rolling under no force at all, we have f = 0. This shouldn’t 
be surprising—with no forces acting on it, the wheel presumably moves with 
constant velocity, and a frictional force would change that. On the other hand, 
friction is supposed to be what causes rolling. So in this case it’s the frictional 
force of 0 that is responsible for rolling! One apparently has to accept this as 
an inevitable consequence of our idealizations. 


Even though a rolling wheel on an inclined plane does involve friction, it is still
a natural candidate for treatment by d’Alembert’s principle for constraints: If
we consider our wheel as a rigid body made up of a large collection of particles,
and let $v$ be any virtual infinitesimal displacement of the wheel, then the inner
product $\langle f, v_p\rangle$ of the frictional force $f$ and the velocity $v_p$ at the point of contact
$p$ is always 0, since $v_p = 0$ according to Proposition 1.

The main difficulty is that we have a sort of hybrid between our initial pendulum
bob problem, where we considered a single particle acted upon by a
constraint force, and the problem of a pendulum as a rigid body, where almost
all our constraints had already been considered in the analysis of rigid body
motion. We really need to think of our wheel as representing two different
“particles” $s$ and $\theta$ in $\mathbb{R}$,

$s$ = position of the center of mass
$\theta$ = angle through which wheel has turned,

having the respective masses

$M$ = the total mass of the wheel
$I$ = the moment of inertia of the wheel.

[Lagrangian mechanics will enable us to handle this more directly.]


We thus have a problem in $\mathbb{R}^2$ that is reduced to a problem on a 1-dimensional
submanifold $M \subset \mathbb{R}^2$ by the rolling condition

$$\theta(t) = s(t)/R.$$

On $M$ we have the obvious single coordinate $s$, the distance along the inclined
plane, and the corresponding tangent vector to $M$ represents the pair

$$(1, 1/R).$$




We are looking for a function $s(t)$ such that

$$(s(t), \theta(t)) = (s(t), s(t)/R)$$

satisfies

$$\begin{aligned} 0 &= \langle (F_s - Ms'',\ 0 - I\theta''),\ (1, 1/R)\rangle \\ &= \langle (F_s - Ms'',\ 0 - Is''/R),\ (1, 1/R)\rangle, \end{aligned}$$

where $F_s$ is the component of the force on the center of mass that is parallel to
the inclined plane. Thus we get

$$\begin{aligned} 0 &= Mg\sin\alpha - Ms'' - (Is''/R)\cdot 1/R \\ &= Mg\sin\alpha - Ms'' - Is''/R^2, \end{aligned}$$

which gives the same result

$$s'' = g\sin\alpha\,\frac{1}{1 + \dfrac{I}{R^2M}}$$

that we obtained previously by combining three equations.





Some subsidiary topics. Before continuing with the major considerations of this 
chapter, we tie up some loose ends by considering a couple of subsidiary topics. 


1. Time-dependent constraints. Consider the 2-dimensional problem of a bead 
that can slide without friction along a rigid wire that is rotating about the origin 
in some plane, with no other external forces on the bead. Naturally, we won’t 
worry about the particular details of the forces involved around the hole in the 





bead, but simply consider the bead to be a point mass constrained to lie on the 
wire at all times. 

We can’t choose our configuration space $M$ to be the set of all possible positions
of the bead, since that is all of $\mathbb{R}^2$, rather than a 1-dimensional manifold.
But we can apply the configuration space method by means of the standard
trick of introducing time as another variable. In other words, instead of a particle
in $\mathbb{R}^2$, we consider a particle $c = (x, y, t)$ in $\mathbb{R}^3$ with the constraint that
$(x(t), y(t))$ lies on the wire at time $t$ and the additional constraint that $t(t) = t$.




If $\theta(t)$ is the angle that the wire makes at time $t$, and $r(t)$ is the distance of
the bead from the origin, then we have

$$c(t) = (r(t)\cos\theta(t),\ r(t)\sin\theta(t),\ t)$$

and we find that

$$c'' = [r'' - r\theta'^2]\cdot(\cos\theta, \sin\theta, 0) + [r'\theta' + (r\theta')']\cdot(-\sin\theta, \cos\theta, 0).$$

Our constraints restrict $c$ to lie on a 1-dimensional manifold for which $r$ is a
coordinate system, and $\partial/\partial r = (\cos\theta, \sin\theta, 0)$. Our one equation

$$0 = \langle -c'', \partial/\partial r\rangle$$

then reduces to $r'' = r\theta'^2$.
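
When the wire rotates uniformly, $\theta' = \omega$, this equation has the explicit solutions $r(t) = r_0\cosh\omega t + (v_0/\omega)\sinh\omega t$, so the bead flies outward. The following Python sketch, added here as an illustration with hypothetical names and sample values, integrates $r'' = r\theta'^2$ by the classical fourth-order Runge-Kutta method for an arbitrary angular velocity and compares with the closed form in the uniform case.

```python
import math

def integrate_bead(r0, v0, omega, t_end, steps=2000):
    """Integrate r'' = omega(t)^2 * r by RK4; omega is a callable t -> theta'(t).

    Returns (r, r') at time t_end.
    """
    def deriv(t, r, v):
        return v, omega(t) ** 2 * r

    h = t_end / steps
    t, r, v = 0.0, r0, v0
    for _ in range(steps):
        k1r, k1v = deriv(t, r, v)
        k2r, k2v = deriv(t + h / 2, r + h / 2 * k1r, v + h / 2 * k1v)
        k3r, k3v = deriv(t + h / 2, r + h / 2 * k2r, v + h / 2 * k2v)
        k4r, k4v = deriv(t + h, r + h * k3r, v + h * k3v)
        r += h / 6 * (k1r + 2 * k2r + 2 * k3r + k4r)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return r, v

if __name__ == "__main__":
    # Uniform rotation with omega = 2, bead released at r = 1 with r' = 0.
    r, v = integrate_bead(1.0, 0.0, lambda t: 2.0, 1.0)
    print(r, math.cosh(2.0))
```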


2. Hinges. So far, we have been considering almost exclusively only rigid bod- 
ies, or systems like a pendulum that act essentially like one, but we often need 
to consider systems involving rigid pieces that are “hinged” together, if, for ex- 
ample, we want to be able to analyze the motions of living objects, as discussed 
on page 86. 

Naturally, the actual details of such hinges are mind-bogglingly complex, but
for theoretical purposes we can simply imagine a “linkage” of three particles 




where the distances between the particles A and B and between B and C must 
remain constant, while the angle $\angle ABC$ can vary freely, with forces applied at
any of these points. 

In terms of the lengths $l_1$ of $AB$ and $l_2$ of $BC$, our system can thus be
described by specifying the position $(x, y)$ of $A$, and the angles $\theta_1$ and $\theta_2$ that
$AB$ and $BC$ make with the perpendicular, so our configuration space is just
$\mathbb{R}^2 \times S^1 \times S^1$.





If we consider the case where $A$ is fixed, reducing our configuration space to
$S^1 \times S^1$, the position of $C$ is given in terms of the coordinates $\theta_1$, $\theta_2$ by

$$(l_1\sin\theta_1 + l_2\sin\theta_2,\ l_1\cos\theta_1 + l_2\cos\theta_2),$$




and

$$\begin{aligned} \partial/\partial\theta_1 &= (l_1\cos\theta_1,\ -l_1\sin\theta_1) \\ \partial/\partial\theta_2 &= (l_2\cos\theta_2,\ -l_2\sin\theta_2). \end{aligned}$$

Geometrically, $\partial/\partial\theta_2$ is $l_2$ times the unit vector $u_2$ at $C$ that is tangent to
the circle of radius $l_2$ with center $B$, while $\partial/\partial\theta_1$ has length $l_1$, but points along
a vector in the direction of the vector at $B$ that is tangent to the circle of
radius $l_1$ with center $A$.






When $B$ and $C$ are particles of masses $m_1$ and $m_2$ we obtain the “double
pendulum”, with rather complicated equations of motion. We’re not going to 
derive these equations here, although we will mention a special case in Chap- 
ter 8. For now, we merely want to point out that such “hinges” can be treated 
by the methods at our disposal, and we can theoretically consider the motion 
of even more complicated linkages. 




It should also be pointed out that we might consider our problem in a rather
different way: We begin with two rigid rods, one described by the coordinates
$(x_1, y_1)$ and $\theta_1$, the other by the coordinates $(x_2, y_2)$ and $\theta_2$, and then add the




additional constraint that $(x_1, y_1) = (x_2, y_2)$. Here we would be imagining a
pair of additional constraint forces, $F_{12} = -F_{21}$, directed along the line from
$(x_1, y_1)$ to $(x_2, y_2)$. Since the configuration space $M$ will involve only points
with $(x_1, y_1) = (x_2, y_2)$, tangent vectors to $M$ will involve two equal vectors
$v_1 = v_2 = w$ at $(x_1, y_1)$ and $(x_2, y_2)$, and consequently

$$\langle v_1, F_{21}\rangle + \langle v_2, F_{12}\rangle = \langle w, F_{21} + F_{12}\rangle = 0,$$

so that d’Alembert’s principle for constraints will still apply.




Holonomic and differential constraints. All the constraint problems examined
thus far were amenable to a treatment paralleling our treatment of rigid bodies,
with the constraints in each case restricting our solutions to lie in a “configuration
space” $M$ that was a submanifold of the larger space for which the problem
was originally posed. We have used the obvious principle that if you are looking
for the solutions of a differential equation on a manifold $N$, and you know that
the solution lies on a submanifold $M \subset N$, then you might as well just consider
what the equation says on $M$, thereby obtaining an equation in fewer variables.

Physicists call such constraints “holonomic”, and physics texts usually present
holonomic constraints only in the context of Lagrangian mechanics. As we will
see in Part III, Lagrangian mechanics relies on the same basic principle, but it
allows us to circumvent the main difficulty encountered in all our examples:
the task of expressing the tangent vectors $\partial/\partial x_i$ for a coordinate system on $M$ in
terms of the standard coordinates on the enveloping space, and of expressing the
acceleration $c''(t)$ of each particle in terms of the coordinate functions $x_i(c(t))$
on $M$. Lagrangian mechanics instead provides a systematic way of writing down
the final equations, without going through such intermediate steps.

Physics texts also mention all sorts of other constraints, like the constraint 
that particles remain within a given box, or outside of a given sphere. Such 
constraints are expressed by inequalities or more complicated conditions, and 
obviously require special considerations in each individual case. But there is one 
other very important sort of non-holonomic constraint that allows a systematic 
treatment. 

The standard example of this kind of constraint is provided by an upright 
disc rolling on a plane. The possible positions of the disc are determined by the 







coordinates $(x, y)$ of the point at which the disc rests on the plane, the angle $\theta$
that a fixed point on the disc makes with the vertical, and the angle $\phi$ that the
plane of the disc makes with the $x$-axis.

This example is rather idealized. To begin with, in order for the disc to
remain upright, we might imagine that it has a companion disc attached to it by 




an axle (for the non-upright disc, see Addendum 9A and Addendum 12A). We 
will want to assume that the axle and the companion disc both have negligible 
weight, and it is also important that the two discs be able to rotate independently 





about this axle, so that, for example, the disc can revolve around a circle, with 
its companion “shadow” disc revolving around a circle of a different radius. 
In addition, although our disc has to have some thickness, we want to imagine 
it to be so small that it actually can roll along a circle—or, indeed, along any 
path—rather than being constrained to roll along a straight line. 

In the simplest case, where there are no external forces, it is easy to guess from
the symmetry of the situation that the disc of mass $m$ and angular momentum $L$
will roll with constant speed along a circle of radius $m/|L|$, or along a straight
line when L = 0. But that doesn’t suggest a general method for solving the 
problem where there are external forces, for example if we tilt the plane, so that 
now the force of gravity is only partially offset by the constraining perpendicular 
force of the plane. 

Unfortunately, we are stymied when we try to use our method of configuration
spaces to reduce the problem to one in fewer variables. Starting with our disc at
a point $(x_0, y_0)$, we can roll it, as in (a), to a nearby point $(x_1, y_1)$ along paths
that all start in the same direction at $(x_0, y_0)$ but reach $(x_1, y_1)$ at different
angles, so that we obtain a whole interval of possible $\phi$ values. Moreover, we
can also roll it, as in (b), along paths that all have the same direction at both
$(x_0, y_0)$ and $(x_1, y_1)$, thereby obtaining a whole interval of possible $\theta$ values.
Thus, the proper configuration space for this problem is a whole neighborhood
of $\mathbb{R}^2 \times S^1 \times S^1$, rather than a lower-dimensional submanifold.


[Figure: paths from $(x_0, y_0)$ to $(x_1, y_1)$, panels (a) and (b).]




This phenomenon is a reflection of a simple fact about the relations between
the coordinates of our disc moving in the space with coordinate functions
$(x, y, \theta, \phi)$. Letting $x(t)$, $y(t)$, $\theta(t)$, $\phi(t)$ denote the components of the coordinates
of the disc, the velocity of the center of mass is $R\theta'$, where $R$ is the radius
of the disc, and consequently (refer to the figure on page 230) we have

$$\begin{aligned} x' &= R\theta'\cos\phi \\ y' &= R\theta'\sin\phi. \end{aligned} \tag{1}$$

This means that the tangent vectors of the curve satisfy

$$\begin{aligned} 0 &= dx - R\cos\phi\,d\theta \\ 0 &= dy - R\sin\phi\,d\theta. \end{aligned} \tag{1$'$}$$

In particular, they therefore satisfy the condition

$$dy - \tan\phi\,dx = 0.$$


This determines a 3-dimensional subspace of all tangent vectors at each point,
but this 3-dimensional distribution isn’t integrable, as we can easily see from the
standard integrability conditions. For example, to apply the differential form
version of the Frobenius integrability theorem, we simply note that the 2-form

$$d(dy - \tan\phi\,dx) = -\sec^2\phi\;d\phi \wedge dx$$

isn’t in the ideal generated by $dy - \tan\phi\,dx$. Equivalently, we can note that the
distribution is spanned by the vectors

$$X_1 = \frac{\partial}{\partial x} + \tan\phi\,\frac{\partial}{\partial y}, \qquad X_2 = \frac{\partial}{\partial\phi}, \qquad X_3 = \frac{\partial}{\partial\theta},$$

but the bracket

$$\left[\frac{\partial}{\partial x} + \tan\phi\,\frac{\partial}{\partial y},\ \frac{\partial}{\partial\phi}\right] = -\sec^2\phi\,\frac{\partial}{\partial y}$$

obviously cannot be written as a linear combination of the three vectors $X_1$,
$X_2$ and $X_3$.
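
The bracket computation can be checked numerically without any symbolic algebra. In the Python sketch below (an illustration added here; the helper names and the sample point are hypothetical) vector fields are functions of the coordinates $(x, y, \theta, \phi)$, the Lie bracket is formed from central-difference directional derivatives of their components, and the 1-form $dy - \tan\phi\,dx$ is then applied to the result.

```python
import math

def X1(p):
    """d/dx + tan(phi) d/dy at the point p = (x, y, theta, phi)."""
    return (1.0, math.tan(p[3]), 0.0, 0.0)

def X2(p):
    """d/dphi at p."""
    return (0.0, 0.0, 0.0, 1.0)

def directional(V, W, p, h=1e-6):
    """Derivative of the components of W at p in the direction V(p)."""
    v = V(p)
    plus = [pi + h * vi for pi, vi in zip(p, v)]
    minus = [pi - h * vi for pi, vi in zip(p, v)]
    return [(a - b) / (2 * h) for a, b in zip(W(plus), W(minus))]

def bracket(V, W, p):
    """Components of [V, W] = (V . grad) W - (W . grad) V at p."""
    return [a - b for a, b in zip(directional(V, W, p), directional(W, V, p))]

def omega(p, v):
    """The 1-form dy - tan(phi) dx applied to the tangent vector v at p."""
    return v[1] - math.tan(p[3]) * v[0]

if __name__ == "__main__":
    p = (0.3, -1.2, 0.7, 0.5)
    b = bracket(X1, X2, p)
    print(b, omega(p, b), -1.0 / math.cos(p[3]) ** 2)
```

The 1-form vanishes on $X_1$ and $X_2$ but takes the value $-\sec^2\phi \neq 0$ on their bracket, which is the failure of integrability in numerical form.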


Thus, although we have a condition that must be satisfied by tangent vectors 
to a solution curve, we can’t select a 3-dimensional configuration space on which 




the solution curves must lie. We can only say the following:

We must have $\langle F - mc'', v\rangle = 0$
for all $v \in \ker(dx - R\cos\phi\,d\theta) \cap \ker(dy - R\sin\phi\,d\theta)$.

Here $F$ is evaluated at $(c(t), t)$ and $c''$ is evaluated at $t$, while $dx - R\cos\phi\,d\theta$
and $dy - R\sin\phi\,d\theta$ are evaluated at $c(t)$.

More generally, if the conditions in (1$'$) are replaced by the vanishing of certain
1-forms $\omega_1, \dots, \omega_L$, then

We must have $\langle F - mc'', v\rangle = 0$
for all $v \in \ker\omega_1 \cap \cdots \cap \ker\omega_L$.

In terms of the linear functional

$$\Phi(v) = \langle F - mc'', v\rangle$$

this condition says that

$$\ker\Phi \supset \ker\omega_1 \cap \cdots \cap \ker\omega_L.$$

We can now appeal to the very same vector space fact (Problem 5-1) that was
used in the proof of Lemma 1 of the previous chapter, and conclude, applying
the argument at each point, that

$$\Phi = \lambda_1\omega_1 + \cdots + \lambda_L\omega_L$$

for some functions $\lambda_1, \dots, \lambda_L$, known as Lagrange multipliers. This leads us to
the following criterion for solutions: 


d’Alembert’s Principle for Differential Constraints: If the constraints
on a system require the tangent vector of the motion to lie in the
subspace $\ker(\omega_1) \cap \cdots \cap \ker(\omega_L)$, then there are Lagrange multipliers
$\lambda_1, \dots, \lambda_L$ such that the motions of the system under the external
forces $F$ satisfy

$$\langle F - mc'', v\rangle = \lambda_1\omega_1(v) + \cdots + \lambda_L\omega_L(v)$$

for all tangent vectors $v$ at $c$.




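We want to apply this to the problem of an upright disc rolling on a plane,
where we again have the relations

$$\begin{aligned} x' &= R\theta'\cos\phi \\ y' &= R\theta'\sin\phi \end{aligned}$$

$$\begin{aligned} 0 &= dx - R\cos\phi\,d\theta \\ 0 &= dy - R\sin\phi\,d\theta. \end{aligned}$$

As in the case of the rolling wheel, we are not dealing with a single particle
$c(t)$; and in the present situation we have to think of the disc as three different
“particles”, the particle $(x, y)$ with mass $M$, the particle $\theta$ with mass $I$, the
moment of inertia of the disc about the axle, and the particle $\phi$ with mass $I_\phi$,
the moment of inertia of the disc about a diameter. We thus have

$$\langle(-Mx'', -My'', -I_\phi\phi'', -I\theta''),\ v\rangle = \lambda_1(dx - R\cos\phi\,d\theta)(v) + \lambda_2(dy - R\sin\phi\,d\theta)(v) \quad\text{for all } v. \tag{$*$}$$

Taking $v = \partial/\partial x$, $\partial/\partial y$, $\partial/\partial\phi$, and $\partial/\partial\theta$, this gives us the equations

$$-Mx'' = \lambda_1 \tag{2x}$$

$$-My'' = \lambda_2 \tag{2y}$$

$$I_\phi\phi'' = 0 \tag{2$\phi$}$$

$$I\theta'' = \lambda_1 R\cos\phi + \lambda_2 R\sin\phi. \tag{2$\theta$}$$

Differentiating our original constraint equations (1) gives

$$\begin{aligned} x'' &= R\theta''\cos\phi - R\theta'\phi'\sin\phi \\ y'' &= R\theta''\sin\phi + R\theta'\phi'\cos\phi, \end{aligned}$$

so substituting (2x) and (2y) into (2$\theta$) gives

$$\begin{aligned} I\theta'' &= -MR\big[(R\theta''\cos\phi - R\theta'\phi'\sin\phi)\cos\phi + (R\theta''\sin\phi + R\theta'\phi'\cos\phi)\sin\phi\big] \\ &= -MR^2\theta''. \end{aligned} \tag{3}$$

Thus $(I + MR^2)\theta'' = 0$, and $\theta'$ is constant.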


Equation (2$\phi$) shows that $\phi'$ is also constant, and if we substitute the two
expressions

$$\begin{aligned} \theta(t) &= at + b \\ \phi(t) &= ct + d \end{aligned}$$



into (1) and solve, we find that $(x, y)$ moves along a circle of radius $Ra/c$ for
$c \neq 0$, or a straight line if $c = 0$, in which case $\phi$ is constant.
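
The circular path can be confirmed by integrating (1) directly. The Python sketch below is an illustration added here, with hypothetical names and sample values: it substitutes $\theta' = a$ and $\phi(t) = ct + d$ into (1), integrates by Simpson's rule on each step (adequate because the right-hand sides depend only on $t$), and measures how far the resulting points stray from the circle of radius $|Ra/c|$.

```python
import math

def roll_path(R, a, c, d=0.0, t_end=10.0, steps=4000):
    """Integrate x' = R a cos(c t + d), y' = R a sin(c t + d) from (0, 0).

    Returns the list of points (x, y) visited at the step endpoints.
    """
    h = t_end / steps
    x = y = 0.0
    points = [(x, y)]
    for i in range(steps):
        t0, tm, t1 = i * h, (i + 0.5) * h, (i + 1) * h
        # Simpson's rule on [t0, t1] for each component
        x += R * a * h / 6 * (math.cos(c * t0 + d) + 4 * math.cos(c * tm + d) + math.cos(c * t1 + d))
        y += R * a * h / 6 * (math.sin(c * t0 + d) + 4 * math.sin(c * tm + d) + math.sin(c * t1 + d))
        points.append((x, y))
    return points

def max_radius_error(points, center, radius):
    """Largest deviation of |point - center| from the given radius."""
    return max(abs(math.hypot(px - center[0], py - center[1]) - radius)
               for px, py in points)

if __name__ == "__main__":
    R, a, c = 0.5, 3.0, 1.5
    path = roll_path(R, a, c)
    center = (0.0, R * a / c)      # centre of the circle when d = 0
    print(max_radius_error(path, center, abs(R * a / c)))
```

For $d = 0$ and a start at the origin the center of the circle is $(0, Ra/c)$, and the printed deviation should be at the level of rounding error.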

Some discussion of the much more complicated case where the disc need not
be vertical is postponed until Addendum 9A, but for now we can apply the same
analysis to the more interesting case where our vertical disc is rolling down an
inclined plane with slope $\alpha$. The only change to ($*$) is that the term $-Mx''$





must be replaced with Mg sina — Mx”. In the set of equations (2), the only 
change is that equation (2x) is replaced by 


gM sina—Mx”" =). 
Proceeding as before, (3) then becomes 
(I + MR7)0"(t) = (MgR sina) cos b(t) 
= (MgRsina)cos(ct + d), 
which, introducing an appropriate constant A, we write simply as 


6” (t) = Acos(ct + da). 


The solutions of this equation, whether derived by Lagrangian mechanics or
by our current method, yield interestingly complex possibilities for the motion.
In the special case $c = 0$, the angle $\phi$ will be constant. This case is essentially
just the same as the case of a wheel rolling down an inclined plane: the disc 
rolls down the inclined plane along the straight line that makes a constant angle 




For $c \neq 0$ we might as well take $d = 0$, so that $\phi(t) = ct$, since this just
amounts to changing the point from which $\phi$ is measured, so there are constants
$B$ and $C$ with

      $\theta'(t) = B\sin(ct) + C,$

and thus

      $x' = \tfrac12 RB\sin(2ct) + RC\cos(ct)$
      $y' = \tfrac12 RB(1 - \cos(2ct)) + RC\sin(ct).$

To see the general shape of the curve along which the wheel rolls, we can
take $c = 1/2$, $R = 1$, and $B = 2$, so that

(*)   $x'(t) = \sin t + C\cos(t/2)$            $\phi(t) = t/2$
      $y'(t) = (1 - \cos t) + C\sin(t/2)$      $\theta'(t) = 2\sin(t/2) + C.$

First we consider the special case $C = 0$. We then have, up to additive constants,

      $x(t) = 1 - \cos t$
      $y(t) = t - \sin t,$


which is the standard parameterization of a cycloid, with the $x$ and $y$ axes
reversed. As in the following picture, the disc (white on one side and gray
on the other) moves along the cycloid, which appears upside-down because $x$
increases in the downward direction. To get this behavior we would need to
start the disc rolling straight down, but with a bit of spin.

[Figure: the disc moving along the cycloid, with the tangent line at a vertex marked.]

After the disc has traveled along the cycloid to the next vertex, we have
$x'(t) = y'(t) = 0$, as well as $\theta'(t) = 0$. The disc then continues along the
next arc of the cycloid; note that it doesn’t just fall back along the arc already
traversed, because it still has a non-zero spin $\phi'(t) = 1/2$.
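As a rough check of the cycloid claim, one can integrate the velocities in (*) numerically for $C = 0$ and compare with $x = 1 - \cos t$, $y = t - \sin t$. The following is only a sketch under the normalization $c = 1/2$, $R = 1$, $B = 2$ used above; the helper names `velocity` and `path` and the step count are hypothetical.

```python
import math

def velocity(t, C):
    """Right-hand sides of (*) for c = 1/2, R = 1, B = 2: returns (x'(t), y'(t))."""
    return (math.sin(t) + C * math.cos(t / 2),
            (1 - math.cos(t)) + C * math.sin(t / 2))

def path(t_end, C, steps=4000):
    """Integrate (*) from t = 0 with x(0) = y(0) = 0 using the midpoint rule."""
    h = t_end / steps
    x = y = 0.0
    for k in range(steps):
        vx, vy = velocity((k + 0.5) * h, C)
        x += vx * h
        y += vy * h
    return x, y

x_half, y_half = path(math.pi, 0.0)        # top of the arch: expect about (2, pi)
x_full, y_full = path(2 * math.pi, 0.0)    # next vertex: expect about (0, 2*pi)
```

The same `path` function can be called with non-zero `C` to trace the less symmetrical curves discussed next.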




The figure below shows the solutions to (*), over the fundamental interval of
length $4\pi$, for increasing values of $C$. The curves become progressively taller
and less symmetrical, and the tangent lines at the vertex become less vertical,
so the axis of the disc goes further beyond the vertical.

[Figure: solution curves for increasing values of $C$, beginning with $C = 0$ and $C = .6$.]

The vertices occur at $t$ with $x'(t) = y'(t) = 0$, both equations giving the same condition

(V)   $\sin\dfrac{t}{2} = -\dfrac{C}{2},$

and we then have $\theta'(t) = 0$ at these vertices.
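The reason both equations lead to the same condition is a factorization by the double-angle formulas. The lines below are an added worked step, using only the expressions in (*):

```latex
\begin{align*}
x'(t) &= \sin t + C\cos\tfrac{t}{2}
       = 2\sin\tfrac{t}{2}\cos\tfrac{t}{2} + C\cos\tfrac{t}{2}
       = \cos\tfrac{t}{2}\,\bigl(2\sin\tfrac{t}{2} + C\bigr),\\
y'(t) &= (1 - \cos t) + C\sin\tfrac{t}{2}
       = 2\sin^2\tfrac{t}{2} + C\sin\tfrac{t}{2}
       = \sin\tfrac{t}{2}\,\bigl(2\sin\tfrac{t}{2} + C\bigr).
\end{align*}
```

Since $\cos(t/2)$ and $\sin(t/2)$ never vanish together, both velocities vanish exactly when the common factor $2\sin(t/2) + C$ does, and that factor is $\theta'(t)$.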
By $C = 1.4$ the curve is quite unsymmetrical, and by $C = 1.7$ it crosses
over itself.

[Figure: solution curves for $C = 1.4$ and a larger value of $C$.]

On the other hand, since the equation (V) has no solutions at all for
$|C| > 2$, those curves are completely smooth, as in the case $C = 2.2$ illustrated
below. The case $|C| = 2$, the transition between crossing curves and smooth
ones, is also smooth, with $x'(\pi) = y'(\pi) = 0$; it looks essentially the same as
the curve for $C = 2.2$, though it looks much closer to having a corner at the
top loop, where it has $y''(\pi) = 0$.

[Figure: the smooth solution curve for $C = 2.2$.]


Finding the constraint forces. Although the Lagrange multipliers $\lambda_i$ that occur
in d’Alembert’s principle for differential constraints may seem to have appeared
out of the blue, they may be interpreted in terms of the constraint forces $C$ on
our system $S$. In fact, consider two systems:


(a) the system S with constraints C and external forces F, 


(b) the system S with no constraints and external forces F + C. 


The systems (a) and (b) obviously have the same solutions. But the solutions 
to (a) satisfy 


      $\langle F - mc'', v \rangle = \lambda_1\omega_1(v) + \cdots + \lambda_L\omega_L(v)$   for all $v$,

while the solutions to (b) satisfy

      $\langle F + C - mc'', v \rangle = 0$   for all $v$;

subtracting the first equation from the second, we find that we must have

      $\langle C, v \rangle = -(\lambda_1\omega_1 + \cdots + \lambda_L\omega_L)(v)$   for all $v$.

By writing out $\lambda_1\omega_1 + \cdots + \lambda_L\omega_L$ in terms of the coordinates, we can then find
all the components of $C$.


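For example, in our original problem of the disc rolling on a horizontal plane,
where we have

      $\lambda_1\omega_1 + \lambda_2\omega_2 = \lambda_1(dx - R\cos\phi\, d\theta) + \lambda_2(dy - R\sin\phi\, d\theta)$
                                    $= \lambda_1\, dx + \lambda_2\, dy - (\lambda_1 R\cos\phi + \lambda_2 R\sin\phi)\, d\theta,$

we find that the components $C_x$ and $C_y$ are given by

      $C_x = \langle C, \partial/\partial x \rangle = -\lambda_1$
      $C_y = \langle C, \partial/\partial y \rangle = -\lambda_2.$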


Thus, the constraint forces can be found in terms of $\lambda_1$ and $\lambda_2$, which we
can determine from (2x) and (2y) once we’ve solved explicitly for $x(t)$ and
$y(t)$. The $x$ and $y$ components together, the vector $(x''(t), y''(t))$, represents
the constraint force on our “particle” $(x(t), y(t))$, and thus the frictional force
exerted by the plane to keep the center of mass in its circular orbit. Since the
center of mass moves in a circle with constant angular velocity, $(x''(t), y''(t))$ is
always perpendicular to the velocity vector $v = (x'(t), y'(t))$ of the center of
mass, as we would expect.


In the case of holonomic constraints, we didn’t need to use the Lagrange
multipliers $\lambda_i$, but we can use them if we want to obtain the constraint forces.
For example, consider the simple rolling wheel problem on page 226. Now we
will simply use the coordinates $s$ and $\theta$ and the relation

      $s'(t) = R\theta'(t)$

between their derivatives. We then have the following condition for all $v$:

      $\left\langle (Mg\sin\alpha - Ms'')\,\dfrac{\partial}{\partial s} - I\theta''\,\dfrac{\partial}{\partial\theta},\ v \right\rangle = \lambda\,(ds - R\,d\theta)(v).$

Taking $v = \partial/\partial s$ and then $v = \partial/\partial\theta$ we get

(a)   $Mg\sin\alpha - Ms'' = \lambda$
(b)   $I\theta'' = \lambda R,$

so

      $Mg\sin\alpha - Ms'' = \dfrac{I\theta''}{R},$

and differentiating the constraint $s' = R\theta'$ gives $s'' = R\theta''$, so this becomes

      $Mg\sin\alpha - Ms'' = \dfrac{I}{R^2}\,s'',$

with the same solution

      $s'' = \dfrac{g\sin\alpha}{1 + \dfrac{I}{R^2 M}}$

as before. Substituting back into (a) then gives

      $\lambda = Mg\sin\alpha \cdot \dfrac{I}{R^2 M + I},$

which agrees with the formula for the frictional force $f$ on page 225.
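The algebra in this example is easy to check with exact rational arithmetic. The sketch below is an illustration only: it solves (a) and (b) together with $s'' = R\theta''$ and confirms the closed form for $\lambda$. The function name `rolling_wheel` and the sample numbers are hypothetical, and `g_sin_alpha` stands for the product $g\sin\alpha$.

```python
from fractions import Fraction

def rolling_wheel(M, I, R, g_sin_alpha):
    """Solve (a) M*g*sin(alpha) - M*s'' = lam and (b) I*theta'' = lam*R with s'' = R*theta''.

    Returns (s'', lam), where lam is the Lagrange multiplier.
    """
    s_dd = g_sin_alpha / (1 + I / (R * R * M))   # acceleration along the incline
    lam = M * g_sin_alpha - M * s_dd             # multiplier read off from (a)
    return s_dd, lam

# Hypothetical sample values, chosen only so that the arithmetic is exact.
M, I, R, gs = Fraction(2), Fraction(3), Fraction(1), Fraction(7)
s_dd, lam = rolling_wheel(M, I, R, gs)
theta_dd = s_dd / R                              # from s'' = R*theta''
```

With these values `lam` comes out as $M g\sin\alpha \cdot I/(R^2M + I)$, and equation (b) holds as well, which is the consistency one would hope for.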

Finally, we should point out that we can easily formulate and use a “mixed” 
version of d’Alembert’s principle, where some of the constraints restrict our 
system to lie in a configuration space M, while other constraints restrict the 
tangent vector of the system to be in the kernels of various 1-forms. 




The rolling sphere. After all these exertions, the question of a rolling sphere
(mathematically, a ball), even on a level plane, might seem quite intimidating.
If $v$ is the velocity of the center of the sphere of radius $R$, while $v_C$ is the velocity
of the contact point $C$, it is easily seen that

      $v_C = v + R\,z \times \omega,$

where $z$ is the unit vector pointing upwards. So condition (2) of Proposition 1
on page 222 for rolling, $v_C = 0$, gives

      $v = -R\,z \times \omega$

(which also follows immediately from the fact that the sphere is instantaneously
rotating about $C$). We also have (page 190) the equation

      $\tau = I\omega'.$

The upwards force at the contact point $C$ that balances the weight of the ball
makes no contribution to the torque $\tau$ around the center, so $\tau$ depends only
on the frictional force $F$ at the point of contact in the plane. Now from the
considerations on page 226, we would suspect that $F = 0$, so that $\tau = 0$, and
hence $\omega$ is constant. But if $\omega$ is constant, then the first equation shows that $v$
is constant. In other words, it appears that the sphere should simply roll along
a straight line with uniform speed.

To prove this, we substitute the above two equations into

      $F = mv'$
      $\tau = -R\,z \times F$

to deduce that

      $I\omega' = -mR\,z \times (-R\,z \times \omega')$
              $= -mR^2\omega'$   (since $\omega'$ is horizontal),

so that $\omega' = 0$. Problem 14 explains that this result doesn’t mean quite what
you might expect.
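The step that uses the horizontality of $\omega'$ is the vector triple product identity. Spelled out as an added step, with $\cdot$ denoting the usual inner product:

```latex
\begin{align*}
z \times (z \times \omega') &= (z \cdot \omega')\,z - (z \cdot z)\,\omega' = -\omega'
  \qquad \text{when } z \cdot \omega' = 0,\\
I\omega' &= -mR\,z \times (-R\,z \times \omega')
          = mR^2\, z \times (z \times \omega')
          = -mR^2\,\omega'.
\end{align*}
```

Hence $(I + mR^2)\,\omega' = 0$. That $\omega'$ is horizontal follows from $I\omega' = \tau = -R\,z \times F$, which is perpendicular to $z$.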

The paradoxical contrast between the disc rolling on a plane and the more 
restricted motion of a sphere rolling on a plane arises because the condition 
that a sphere is rolling is stronger than the condition that a disc is rolling, since 
we have to account for the motion of the contact point in two directions rather 
than just one: when the upright disc moves in a circle, there must be a cen- 
tripetal force directed toward the center, but this comes from a frictional force 
perpendicular to the plane of the disc, and thus irrelevant to the question of 
whether the disc rolls. 




Give a physics student enough rope problems ... Elementary physics texts seem 
to delight in presenting problems involving ropes (or strings, or some other type 
of “filament”), like the classic example of two weights on opposite sides of a 





pulley, which is often referred to in mechanics texts by the rather mysterious 
name of “The Atwood machine”. 

The simplest way of dealing with the Atwood machine is to regard the pulley
merely as a device that allows the oppositely directed forces to be produced
by the force of gravity acting in the same direction: in other words, we treat
the problem as if it involved oppositely directed forces on two weights that are
attached to a long, weightless, rigid rod. The implicitly assumed unstretchability
of the rope, or the rigidity of the corresponding rod, merely ensures that the two
weights stay a constant distance apart.

[Figure: the two weights $m_1$ and $m_2$ attached to a rigid rod.]

Since we will soon be analyzing the forces that ropes entail in some detail,
for now let’s simply solve this presumably equivalent problem like any other
constraint problem. Our system is determined by the positions $x_1$, $x_2$ of the
weights along the line, and since $x_2 - x_1$ is constant, we have a 1-dimensional
configuration space $M$, and the single equation

      $(gm_1 - m_1 x_1'') + (-gm_2 - m_2 x_2'') = 0;$

together with $x_2'' = x_1''$, this gives

      $x_1'' = g\,\dfrac{m_1 - m_2}{m_1 + m_2}.$

In particular, if we start with equal masses $M$, and then add a very small mass $m$
to one side, the acceleration

      $x_1'' = g\,\dfrac{m}{2M + m}$

will have a very small value, allowing us to measure it much more accurately
than we could directly measure the acceleration $g$ of a body falling freely. The
first mechanism of this sort was constructed by the Rev. George Atwood (1746-1807),
a tutor at Trinity College, Cambridge. Atwood, not having ball-bearings,
employed a rather ingenious mechanism to reduce friction; for some very nice
pictures go to the web site physics.kenyon.edu/EarlyApparatus and click
on Mechanics, and then on Atwood’s Machine.
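To get a feeling for how much the machine slows things down, here is a small sketch that evaluates the acceleration formula and compares fall times. It is an illustration under assumed numbers (two 1 kg weights, a 10 g added mass, a 1 m drop, $g = 9.81$ in SI units), none of which come from the text; the function names are hypothetical.

```python
def atwood_acceleration(m1, m2, g=9.81):
    """Acceleration x1'' = g*(m1 - m2)/(m1 + m2); positive when m1 is the heavier side."""
    return g * (m1 - m2) / (m1 + m2)

def fall_time(distance, accel):
    """Time to cover `distance` starting from rest under constant acceleration."""
    return (2 * distance / accel) ** 0.5

# Hypothetical numbers: equal masses M, a small added mass m, and a drop height.
M, m, drop = 1.0, 0.010, 1.0
a = atwood_acceleration(M + m, M)      # equals g*m/(2M + m)
t_machine = fall_time(drop, a)         # time on the Atwood machine
t_free = fall_time(drop, 9.81)         # time in free fall
```

The ratio of the two times is $\sqrt{g/a} = \sqrt{(2M + m)/m}$, so with these numbers the motion is stretched out by a factor of about 14, which is the practical point of the device.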

For a more detailed analysis of ropes and such, we certainly don’t want to
worry about the details of how ropes are attached to other objects, either by
the friction of knots, or some sort of glue. Instead, we’ll simply attach a rope to
another object by making an endpoint of the rope a particle on the surface of
that object.

Our rope, or other filament, may be regarded as the limiting case of a very
large collection of very small rigid rods that are linked together by hinges. When
a force $F$ is applied to the free end of a filament, the filament will become “taut”,
arranging itself along a line in the direction of $F$, which is how we will always picture
it. If a force $F$ is applied to one end of a filament of mass $m$ attached to an
object of mass $M$, then the filament and attached object have an acceleration $a$
satisfying

      $F = (M + m)\,a.$

If $F_1$ is the force that the last particle of the filament exerts on the object of
mass $M$, then we also have (compare Problem 1-4)

      $F_1 = Ma,$

and consequently

      $F_1 = \dfrac{M}{M + m}\,F.$

Thus, the force transmitted to the object of mass $M$ is less than our original
force $F$. Physicists like to consider the idealized case of a “massless” filament,
with $m = 0$, which can be regarded as the limiting case when $m$ is very small
compared to $M$. In this special case of an idealized massless filament we will
have $F_1 = F$.

Consider a filament of mass $m$ with a force $F$ at one end and a force $F'$ at
the other, and thus an acceleration $a$ given by

(a)   $F + F' = ma.$

[Figure: a filament with the force $F'$ acting at its left end and the force $F$ at its right end.]

Although the internal forces of a real rope extend in all sorts of directions, for
our idealized filament it’s convenient to assume that each particle exerts a force
only on the particles right next to it. If $A$ and $B$ are two adjacent particles,
and $m_1$ is the mass of the part of the filament to the left of $A$, then the force
$F_{BA}$ that $A$ exerts on $B$ satisfies

(b)   $F_{BA} + F' = m_1 a.$

Thus,

(c)   $F_{BA} = m_1 a - F'$
             $= \dfrac{m_1}{m}(F + F') - F'$   by (a)
             $= \dfrac{m_1}{m}\,F + \left(\dfrac{m_1}{m} - 1\right)F'.$

So the force varies in magnitude from $|F|$, at the far end, where $m_1 = m$, to $|F'|$
at the other end, where $m_1 = 0$. The magnitude of the force at any point of
the filament is called its tension $T$ at that point.
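Equation (c) can be tabulated along the filament to see how the tension interpolates between the two end forces. The sketch below treats the forces as signed numbers along the line of the filament (positive to the right), which is an assumption made for the illustration; the function names and the sample values are hypothetical.

```python
from fractions import Fraction

def internal_force(m1, m, F, F_prime):
    """Force F_BA that particle A exerts on its left neighbour B, from equation (c).

    m1 is the mass of the filament to the left of A and m the total mass;
    F and F_prime are the signed end forces (positive to the right).
    """
    return (m1 / m) * F + (m1 / m - 1) * F_prime

def tension(m1, m, F, F_prime):
    """Tension T: the magnitude of the internal force at the chosen point."""
    return abs(internal_force(m1, m, F, F_prime))

# Hypothetical example: total mass 2, pulled to the right with 5 and to the left with 3.
m, F, F_prime = Fraction(2), Fraction(5), Fraction(-3)
profile = [tension(Fraction(k, 4) * m, m, F, F_prime) for k in range(5)]

# Balanced end forces (F' = -F) should give the same tension everywhere.
balanced = [tension(Fraction(k, 4) * m, m, F, -F) for k in range(5)]
```

In the unbalanced example the tension rises linearly from $|F'|$ to $|F|$, while in the balanced case it is constant, matching the discussion of the violin string that follows.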




Notice that $T$ is a number, not a vector: at any point $A$ of the filament, it is
the magnitude of the force $F_{BA}$ that $A$ exerts on the adjacent particle $B$, and
thus the magnitude of the equal but opposite force $F_{AB}$ that $B$ exerts on $A$.
This tension could be determined by inserting a spring into our filament at $A$,
and measuring how much the spring is stretched.

In a situation like a tightened violin string we have $F' = -F$, with a
non-accelerating string, and the tension $T$ has the constant value $|F|$. Of course, the
tension can be increased only by stretching the violin string, so our idealized
filament represents the usual sort of strange hybrid, where we pretend to be
working with rigid bodies, or unstretchable filaments, even though the required
internal forces can only arise from minute amounts of stretchability.

When the forces at the ends do not balance, so that our string is accelerating,
the tension will not be constant. However, the idealized case of a massless
filament will make sense when at least one end of our filament is attached to an
object which has a non-negligible mass. If we enlarge our previous picture, of
a filament with forces at either end, to include objects of masses $M$ and $M'$ at
the ends, then equation (c) is replaced by a rather more complicated relation,

      $F_{BA} = \dfrac{m_1 + M'}{m + M + M'}\,F + \left(\dfrac{m_1 + M'}{m + M + M'} - 1\right)F',$

but when we take $m$, and thus $m_1$, to be very small, all $F_{BA}$ have the constant
value

      $\dfrac{M'}{M + M'}\,F + \left(\dfrac{M'}{M + M'} - 1\right)F' = \dfrac{M'}{M + M'}\,F - \dfrac{M}{M + M'}\,F',$

so that the tension will again be constant along the filament.




Now let us consider a filament that is wrapped partly around a fixed object,
with forces $F$ and $F'$ acting tangentially at the ends (a). Of course, in practice we
would normally have some additional filament on each side (b), but we already
know how to handle those situations.

[Figure: (a) a filament wrapped partly around a fixed object; (b) the same arrangement with additional filament on each side.]

We can show that for the case of a massless
filament with at least one end attached to an object of non-negligible mass, $F$
and $F'$ will again have the same magnitudes, even though their directions may
vary; the analysis is just a more complicated version of the previous one,
noting that any normal forces that arise in the analysis are balanced out by the
normal forces that the object exerts on the various particles.

That is the basis for the usual analysis of the Atwood machine, if instead of
thinking of a cord passing over a wheel, we imagine that the cord is sliding
frictionlessly over a fixed rod (in practice we would have to grease the rod and/or
the cord pretty heavily to obtain anything close to the theoretical ideal of
frictionless motion). The object of mass $m_2$ has an acceleration $a$ downward given
by

      $m_2 g - T = m_2 a,$

while the object of mass $m_1$ has an acceleration $-a$ given by

      $m_1 g - T = m_1(-a),$

so that

      $a = \dfrac{g(m_2 - m_1)}{m_1 + m_2}$

as before, and we also find the tension $T = 2g\,m_1 m_2/(m_1 + m_2)$.




Of course, in practice, we use a pulley in which the cords pass over a wheel, 
with nice low-friction ball-bearings, rather than a heavily greased cord sliding 
over a fixed rod. But the fixed rod picture actually gives the best representation 
of the usual elementary analysis of the Atwood machine, because this analysis 
conveniently assumes implicitly that the wheel over which the cord passes has 
negligible mass. 

The case of an Atwood machine with a “massive” wheel (i.e., a wheel with
non-negligible mass) is seldom discussed in physics textbooks, and those that
do discuss it engage in the usual maddeningly nonchalant assumption that any
fool would know how to analyze it. In reality, however, the Atwood machine
with a massive wheel represents something quite unlike anything else we have
considered.

We are still going to ignore losses due to friction, but this just means that we
are going to assume that the wheel rotates on its axis without friction. Unlike the
case of a fixed rod, on which the cord slides without friction, we now want the
motion of the cord to cause the wheel to turn. But that would seem to require
the friction of the cord on the wheel, which we would also like to ignore! We
appear to have something like the paradox of a rolling wheel, but with even
greater complications: our wheel touches the cord on which it “rolls” along a
whole stretch of cord, and the velocities of the points of the wheel are not 0 at
these points.

A good representation of this apparently contradictory theoretical picture is
given by an actual mechanism, the chain and sprocket wheel, known to everyone
who rides a bicycle. Industrially, a chain and sprocket wheel is used when a belt
drive would be inadequate, precisely because the belt would have too much
slippage.

[Figure: a chain and sprocket wheel, and a belt drive.]




As an idealized version, we can imagine that as the individual particles in
our filament make contact with the wheel, they attach themselves to particles
in the wheel, dutifully detaching themselves when it is time to leave the path of
the wheel. These particles just go along for the ride while they are in contact
with the wheel, and the usual internal forces between them temporarily vanish.
Consequently, at each moment we basically have two different cords attached
to the wheel at opposite points, each of which can have its own tension.





For a wheel of radius $r$, the torque on the wheel is $rT_2 - rT_1$, so if $\alpha$ is the
angular acceleration of the wheel, we have $rT_2 - rT_1 = \alpha I$, where $I$ is the
moment of inertia of the wheel around its center, and thus

      $T_2 - T_1 = (I/r)\,\alpha.$

If $a$ is the acceleration of the cord, we also have

      $gm_2 - T_2 = m_2 a$
      $T_1 - gm_1 = m_1 a.$

Adding these three equations, we obtain

      $gm_2 - gm_1 = m_1 a + m_2 a + \dfrac{I}{r}\,\alpha.$

But $r\alpha$ is the magnitude of the acceleration of a point on the circumference of
the wheel, so we must have $a = r\alpha$, and thus

      $gm_2 - gm_1 = m_1 a + m_2 a + \dfrac{I}{r^2}\,a,$

giving us

(*)   $a = \dfrac{g(m_2 - m_1)}{m_1 + m_2 + \dfrac{I}{r^2}}.$

Since the total force $F$ on our system is $g(m_2 - m_1)$, equation (*) may be
regarded as saying that the “effective total mass” of our system is $m_1 + m_2 + (I/r^2)$,
so that the wheel has an “effective mass” of $I/r^2$, rather than $I$. This
might have been expected, since motion through distance $s$ for the end masses
corresponds to a rotation of $s/r$ radians for the wheel, and the torque equation
$rF = I\cdot\theta'' = I\cdot s''/r$ can be written as $F = (I/r^2)\,s''$. It also suggests an easy
way to find the solution (Problem 16), if we don’t need to find the tensions.
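The three equations above are simple enough to solve mechanically, and doing so makes it easy to see the two tensions separate as the wheel becomes massive. The following sketch uses exact fractions; the function name `atwood_with_wheel` and the numerical values (in some consistent system of units) are hypothetical.

```python
from fractions import Fraction

def atwood_with_wheel(m1, m2, I, r, g):
    """Acceleration and the two tensions for an Atwood machine whose wheel has moment of inertia I.

    Uses (*): a = g*(m2 - m1) / (m1 + m2 + I/r**2), and then
    g*m2 - T2 = m2*a and T1 - g*m1 = m1*a to recover the tensions.
    """
    a = g * (m2 - m1) / (m1 + m2 + I / r**2)
    T2 = m2 * (g - a)
    T1 = m1 * (g + a)
    return a, T1, T2

# Hypothetical values, chosen so that the fractions stay simple.
m1, m2, r, g = Fraction(1), Fraction(3), Fraction(1), Fraction(10)
a, T1, T2 = atwood_with_wheel(m1, m2, Fraction(2), r, g)          # massive wheel
a0, T1_0, T2_0 = atwood_with_wheel(m1, m2, Fraction(0), r, g)     # massless wheel
```

For $I = 0$ the two tensions coincide with the single value $2g\,m_1m_2/(m_1 + m_2)$ found for the cord sliding over a fixed rod, while for $I > 0$ their difference is $(I/r^2)\,a$, the force needed to accelerate the wheel's "effective mass".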


Note, by the way, that even after we “straighten out” the problem in the same 
way that we originally treated the Atwood machine, we need to think in terms 





of this chain and sprocket wheel picture if our rope causes a massive wheel to 
rotate as it is being pulled. A similar analysis is necessary if we have an object 
constrained to move along a track so that it causes a massive wheel to rotate as 
it falls. 







ADDENDUM 6A 
THE BOUNCING SUPERBALL 


In Chapter 3 we considered a ball bouncing off a flat surface to justify the 
term “perfectly elastic” for a collision preserving momentum and kinetic energy. 
The arguments at the beginning of this chapter would lead us to say that if a 
rigid ball is thrown straight at a flat surface (with effectively infinite mass), then 
the reasonable presumption is that it will simply bounce back with the same 
velocity. 

For the case where the ball is thrown at the floor at an angle, it seems
reasonable to argue that when the ball hits the floor, the horizontal component $H$
of its velocity is unchanged, while the vertical component $V$ causes the floor
to produce a force giving an opposite vertical component of $-V$, so that
the ball bounces back according to the old rule “angle of incidence equals angle of
reflection”.

But what happens when the ball is spinning? The common pink ball used 
in many children’s games seems to follow the same rule, at least approximately, 
while a more sturdy tennis ball definitely shows some deviance from that rule. 
The SuperBall (see Problem 3-26) represents a fair approximation to both per- 
fect elasticity and rigidity, and it bounces in a quite remarkable way. 

Let $h_0$ and $v_0$ be the signed magnitudes of the initial components of the
velocity of the ball (positive $h_0$ represents motion to the right, positive $v_0$ motion
downwards). We will take the simple case where the ball is spinning about the
axis perpendicular to the plane of the figure, and we will let $\omega_0$ be its angular
velocity about that axis. Similarly, $h_1$, $v_1$, and $\omega_1$ will be the values after the
ball bounces back.

We will assume, as before, that $v_1 = -v_0$. If $I$ is the moment of inertia of
the ball about its center, then conservation of kinetic energy gives

      $\tfrac12 I\omega_0^2 + \tfrac12 m h_0^2 = \tfrac12 I\omega_1^2 + \tfrac12 m h_1^2,$

which can be written as

(a)   $m(h_1 - h_0)(h_1 + h_0) = -I(\omega_1 - \omega_0)(\omega_1 + \omega_0).$


It also seems reasonable to assume that the floor is “perfectly rough”, with
conservation of angular momentum at the point of contact of the collision. This
gives the equation

(b)   $ma(h_1 - h_0) = -I(\omega_1 - \omega_0),$

where $a$ is the radius of the ball.

We will ignore the solution $h_1 = h_0$, $\omega_1 = \omega_0$, since this would imply that
there were no frictional forces from the floor to change the spin. So we can
divide (b) into (a), to obtain $h_1 + h_0 = a(\omega_1 + \omega_0)$, and thus

(c)   $h_1 - a\omega_1 = -(h_0 - a\omega_0).$

[Since $h - a\omega$ is the horizontal velocity at the point of contact, this says that it
is exactly reversed at the bounce.]

Solving (b) and (c) for $h_1$ and $\omega_1$ in terms of $h_0$ and $\omega_0$, and using $I = \tfrac25 ma^2$,
we end up with

(*)   $h_1 = \tfrac37\,h_0 + \tfrac47\,\omega_0 a$
      $\omega_1 = -\tfrac37\,\omega_0 + \tfrac{10}{7}\,\dfrac{h_0}{a}.$

For $h_0 = 0$ we obtain

      $h_1 = \tfrac47\,\omega_0 a, \qquad \omega_1 = -\tfrac37\,\omega_0,$

so if the ball is thrown vertically downwards with spin it bounces up at an angle,
with a (smaller) reversed spin!


Similarly, if we throw the ball at an angle, but without spin, it will end up with
spin. As a matter of fact, there is a SuperBall trick that involves bouncing the
ball under a table, without spin, and having it return, with a spin. If we start
with $\omega_0 = 0$, then for the first bounce (*) gives

(B1)   $h_1 = \tfrac37\,h_0, \qquad \omega_1 = \tfrac{10}{7}\,\dfrac{h_0}{a}.$


For the second bounce, downwards, the angular velocities need to be calculated
in the opposite direction, so we have to change the final + signs in each equation
of (*) to minus signs, so that we get

(B2)   $h_2 = \tfrac37\,h_1 - \tfrac47\,\omega_1 a, \qquad \omega_2 = -\tfrac37\,\omega_1 - \tfrac{10}{7}\,\dfrac{h_1}{a},$

or

       $h_2 = -\tfrac{31}{49}\,h_0, \qquad \omega_2 = -\tfrac{60}{49}\,\dfrac{h_0}{a}.$

Similarly, for the third bounce we obtain, finally,

(B3)   $h_3 = -\tfrac{333}{343}\,h_0, \qquad \omega_3 = -\tfrac{130}{343}\,\dfrac{h_0}{a},$

and the ball returns in practically the same direction, with slightly slower speed,
but with the same total kinetic energy.
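The three bounces are easy to reproduce with exact fractions, which also confirms that the horizontal plus rotational kinetic energy is unchanged at every bounce. This is only a sketch of the rule (*) as stated above, with the sign change for the bounce under the table; the function names are hypothetical, and the units are chosen (as an assumption) so that $h_0 = 1$ and $a = 1$.

```python
from fractions import Fraction

def bounce(h, w, a, under_table=False):
    """One bounce according to (*), for a ball of radius a with I = (2/5) m a^2.

    For the bounce against the underside of the table the final + signs
    in (*) become minus signs, as described in the text.
    """
    s = -1 if under_table else 1
    h_new = Fraction(3, 7) * h + s * Fraction(4, 7) * w * a
    w_new = -Fraction(3, 7) * w + s * Fraction(10, 7) * h / a
    return h_new, w_new

def energy(h, w, a):
    """Horizontal plus rotational kinetic energy, in units of m/2."""
    return h * h + Fraction(2, 5) * a * a * w * w

a = Fraction(1)
states = [(Fraction(1), Fraction(0))]          # (h0, w0): thrown without spin
for under_table in (False, True, False):       # floor, table, floor
    states.append(bounce(*states[-1], a, under_table))
```

After the loop, `states[1]`, `states[2]`, and `states[3]` hold the pairs $(h_k, \omega_k)$ for the three bounces; the last has $h_3$ close to $-h_0$, which is the ball coming back toward the thrower.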

An analysis of this sort first appeared in Garwin [1], where some arguments, 
which I don’t understand, are given to justify the various assumptions that we 
have made. Other justifications, which I also don’t understand, may be found 
in Barger and Olsson [1]. 




ADDENDUM 6B 
STATICALLY INDETERMINATE PROBLEMS 


All statically indeterminate problems require some information about the 
elastic properties of materials, but in some cases the information is elementary 
and easy to apply. 

For example, consider three filaments, of the same material, arranged as in (a), 
so that ABD is an isosceles triangle, and CD is its altitude. A weight W is hung 





(a) (b) (c) 


from the end, as in (b), so that the side filaments exert a force of magnitude P 
along their directions, while the middle filament exerts a force of magnitude Q. 
These forces actually come about because the filaments stretch slightly as in (c). 

We assume that the filaments obey Hooke’s law: If the filament is fixed at one
end and the force $F$ pulls on the other end, then

      $|F| = \lambda\,\dfrac{\Delta l}{l}$   for some constant $\lambda$,

where $l$ is the length of the unstretched filament, and $\Delta l$ is the increase in length
(this holds only for a certain interval of $\Delta l$ values).

We could use geometry to find the length $AD' - AD$ of the left filament in
terms of the change in length $\delta = DD'$, express both $P$ and $Q$ in terms of $\delta$
and the constant $\lambda$, and then use $2P + Q = W$ to find $\delta$, leading to rather
messy formulas for $P$ and $Q$, in terms of $\lambda$. But it is much easier (Problem 22)
to see what the limiting values are for $\delta \to 0$, and these limiting values, which
don’t involve $\lambda$ at all, give a reasonable answer for most filaments, like a piece
of wire, where $\lambda$ is very large, and $\delta$ very small.




A more involved problem is encountered when we consider a plank resting 
on three identical supports. If the plank were perfectly rigid, the supports would 





all have to be compressed by the same amount, and thus all have to provide 
an upward force exactly 1/3 of the weight of the plank. But we are interested 
instead in analyzing the effect of the plank’s bending, which results in different 
compressions of the supports, and thus different upward forces. In fact, we 
aren’t interested in the extremely tiny compression of the supports at all, only 
in the upward forces that they will have to provide to balance the bent plank. 
For simplicity we consider “knife edge” supports, which touch the plank along 





a line, appearing as a single point in our 2-dimensional section. 
Hooke’s law also holds for a plank or rod, except that it is stated a bit
differently, because we want to take into account the cross-section $A$, which was
essentially assumed to be 0 for the case of a filament.

[Figure: a rod with cross-section of area $A$.]

So we consider the ratios

      stress $\sigma = \dfrac{|F|}{A}$,      strain $\varepsilon = \dfrac{\Delta l}{l}$,

and write Hooke’s law in the form

      $\sigma = E\varepsilon$

for a constant $E$, the modulus of elasticity. Of course, $A$ actually changes a bit
when the force $F$ is applied, but the change is so minute that it is disregarded.
It should also be noted that the modulus of elasticity for stretching might not
be the same as that for compression (concrete is supposed to be an example),
but we will consider only the case where they are the same.

For steel, $E \approx 29 \times 10^6$ psi (pounds per square inch), while for molybdenum,
$E \approx 49 \times 10^6$ psi. For wood $E$ varies between $\approx .6 \times 10^6$ psi and $\approx 1.7 \times 10^6$ psi,
depending on the type and grade of wood, the direction of the load, etc.




Now consider a long plank supported by various knife edge supports. The
plank is actually going to sag a small amount, so that viewing it head on we
see something like the picture shown below. To the left of the dotted line, there
is a compressing stress along the top and a stretching stress along the bottom,
while to the right of the dotted line just the opposite is true. The stress is thus 0
at some intermediate surface, the neutral plane, whose profile, shown as a heavy
line in the figure, is the graph of some function $f$. The figure also shows a
cross-section of the “fibre” through $(x, f(x))$, that is, the surface into which a
vertical section of the plank is deformed.

The figure below is a greatly enlarged view of a small portion of the figure
near the point $(x, f(x))$, bounded by two fibres. While the heavy line is the
graph of $f$, the cross-section of the neutral plane, the dotted line indicates the
cross-section of a surface where the stress has the constant value $y$. The whole
region is filled up by such surfaces having values in some interval $(y_0, y_1)$.

Let the curvature of the graph of $f$ at the point $(x, f(x))$ be $\kappa(x)$, where
(see page 62) we have

      $\kappa(x) = \dfrac{f''(x)}{[1 + (f'(x))^2]^{3/2}}.$


Near the point $(x, f(x))$, the graph is very close to a segment of a circle
subtending an angle $\theta$ with radius $R = 1/(-\kappa)$, where the minus sign is necessary
because $f'' < 0$ at this point. So the solid and dotted lines have lengths $\lambda$ and $\lambda'$
given, to first order, by

      $\lambda = R\theta$
      $\lambda' = (R + y)\theta.$

Consequently, the strain $\varepsilon$ along the surface indicated by the dotted line is given
by

      $\varepsilon = \dfrac{\lambda' - \lambda}{\lambda} = \dfrac{y\theta}{R\theta} = \dfrac{y}{R} = -y\kappa,$

and the stress along this surface is

      $\sigma = \varepsilon E = -y\kappa E.$


It is easy to check that for points where $f'' > 0$ we get exactly the same formula.

For the fibre $A$ through the point $(x, f(x))$, let $\tau(x)$ be the total torque on $A$,
with respect to the point $(x, f(x))$, from all the external forces to the left of the
point (gravity acting down on the portion of the board to the left, together with
the upward force of any supports to the left). Since $A$ isn’t rotating, we must
have the following, where $b$ is the width of the plank:

      $\tau(x) = \int_A \sigma \cdot y$
             $= b\int_{y_0}^{y_1} \sigma(x, y)\cdot y\, dy$
             $= -E\kappa(x)\int_{y_0}^{y_1} b\,y^2\, dy$
             $= -EI\kappa(x),$

where

      $I = \int_{y_0}^{y_1} b\,y^2\, dy$

is, by definition, the moment of inertia of $A$, which we will assume can be taken
to be a constant.

This gives us the equation $\tau(x) = -EI\kappa(x)$. Finally, since $f'$ is usually going
to be extremely small, we simply throw away the $f'(x)$ term in the expression
for $\kappa(x)$, leading to the Euler-Bernoulli equation for thin plank bending,

(*)   $-EI f''(x) = \tau(x).$


As a simple application of the Euler-Bernoulli equation, consider a plank of
length $a$ resting on three knife edge supports, two at the ends, and one in the
middle. The plank, of weight $W$, is assumed to have uniform density $w = W/a$;
the outside supports each exert an upward force of $P$ and the middle support
exerts an upward force of $Q$, with $2P + Q = W$. For convenience, we choose
the position of the x-axis so that our function $f$ is 0 at 0, $a/2$, and $a$.

For $0 < x < a/2$ we have

      $\tau(x) = -Px + \tfrac12 wx^2,$

where the first term is the moment of the upward force $P$ at distance $x$ from
our point, and the second term comes from the uniformly distributed force of $w$
along the plank of length $x$ to the left of our point. Thus

      $EI f''(x) = Px - \tfrac12 wx^2$

and

      $EI f'(x) = \tfrac12 Px^2 - \tfrac16 wx^3 + C_1.$

There is another equation for $a/2 < x < a$, involving another constant $C_2$, but
in this case we can use symmetry to dispense with the second expression. Since
we clearly have $f'(a/2) = 0$, we can immediately solve for $C_1$, to get

      $EI f'(x) = \dfrac{Px^2}{2} - \dfrac{wx^3}{6} + \dfrac{wa^3}{48} - \dfrac{Pa^2}{8},$

and since $f(0) = 0$ this gives

      $EI f(x) = \dfrac{Px^3}{6} - \dfrac{wx^4}{24} + \left(\dfrac{wa^3}{48} - \dfrac{Pa^2}{8}\right)x.$

Finally, using $f(a/2) = 0$, and remembering that $aw = W$, this gives $P = \tfrac{3}{16}W$.
So each end provides an upward force of $\tfrac{3}{16}W$, while the middle support bears
most of the weight, providing an upward force of $\tfrac58 W$.
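The last step, solving $f(a/2) = 0$ for $P$, is a linear equation and can be checked exactly. The sketch below evaluates the expression for $EI f(x)$ given above and solves for $P$ by splitting it into the part proportional to $P$ and the part independent of $P$; the function names are hypothetical, and the unit weight and unit length are arbitrary choices for the illustration.

```python
from fractions import Fraction

def EI_deflection(x, P, w, a):
    """EI*f(x) on 0 <= x <= a/2: P*x^3/6 - w*x^4/24 + (w*a^3/48 - P*a^2/8)*x."""
    return P * x**3 / 6 - w * x**4 / 24 + (w * a**3 / 48 - P * a**2 / 8) * x

def end_support_force(W, a):
    """Solve EI*f(a/2) = 0 for P, using the fact that the condition is linear in P."""
    w = W / a                                                   # weight per unit length
    mid = a / 2
    slope = EI_deflection(mid, Fraction(1), Fraction(0), a)     # coefficient of P
    offset = EI_deflection(mid, Fraction(0), w, a)              # part independent of P
    return -offset / slope

W, a = Fraction(1), Fraction(1)    # hypothetical units: unit weight, unit length
P = end_support_force(W, a)
Q = W - 2 * P                      # force at the middle support
```

The result does not depend on the length $a$, only on $W$, which is what the formula $P = \tfrac{3}{16}W$ says.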




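By the way, from symmetry, on the intervals $0 < x < a/2$ and $a/2 < x < a$
we obviously have

      $EI f''(x) = \begin{cases} Px - \tfrac12 wx^2 \\ P(a - x) - \tfrac12 w(a - x)^2 \end{cases}$   $\Longrightarrow$   $EI f'''(x) = \begin{cases} P - wx \\ -P + w(a - x). \end{cases}$

Thus at $a/2$ there is a jump discontinuity in $EI f''' = -\tau'$ of

      $2\left(P - \dfrac{wa}{2}\right) = 2P - W = -Q,$

due to the upward force $Q$ concentrated at $a/2$.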

This simple example, as well as several others, comes from the delightfully
old-fashioned book Synge and Griffith [1], which does not eschew “engineering”
type problems. There are, of course, many subtleties that have been overlooked
in this brief description, and modern books often don’t even mention
the equation explicitly, because it is basically a simplification (Hooke’s law itself
is basically just a simplification). Novices can easily be misled, as I discovered
when I tried a simple “home-lab” experiment, using three scales and a brass
strip. Though it gives a result vaguely close to the theoretical one when the brass
strip is about 3 feet long, for shorter lengths the results are dramatically wrong,
with the reading on the middle scale lower than the readings at the ends. It
took me a long time to realize that this is because the results are reliable only
for thin beams, ones whose thickness is small compared to their length.

Although you might not even find the name Bernoulli in a modern mechan- 
ical engineering book, you probably will find Euler’s name, though usually in 
connection with his theory of column buckling, which is a sort of vertical version 
of plank bending. For an interesting “home-lab” experiment that one can do 
concerning Euler buckling, see Casey [1]. 

More sophisticated analyses of all these matters are to be found in studies of 
elasticity. For mathematicians interested in exploring this subject, I suspect that 
a good place to start would be Marsden and Hughes [1]. 




PROBLEMS 


1. Consider the iron, wood, magnet combination on page 206, except that the 
iron and wood are in contact, but separated from the magnet, towards which 
they are accelerating. If F is the force that the magnet exerts on the iron, of 





mass M, what is the force that the iron is exerting on the wood, of mass m? If 
instead the wood is in contact with the magnet, of mass M’, what force is the 
magnet exerting on the wood? What happens at the moment that the wood 
comes in contact with both the iron and the magnet? 


2.¹ For the three positions A, B, C of a pendulum bob released from position A,
draw the four vectors shown in the figure on page 210. The vectors needn’t be 
drawn precisely to scale, but the relative sizes should be clearly indicated. 




3. Measuring g with a pendulum bob on a string is very inexact because the 
string length is difficult to measure very precisely. A physical pendulum avoids 
this problem, but introduces a new problem, the difficulty of finding the center 
of mass, or equivalently the radius of gyration k. The Kater pendulum cleverly 
circumvents this problem. In essence, the pendulum consists of a rod on which
there are two knife edges pointing in opposite directions, either of which can be
placed on a fixed bar and used as the pivot point of a pendulum.

[Figure: the pendulum resting on the fixed bar, with the center of mass and the distance $l_1$ labeled.]


¹ Cf. agm.cat/recerca-divulgacio/pendulum-TPT.pdf.




(a) If $l_1$ and $l_2$ are the distances from the center of mass to the two knife edges,
then for small oscillations the periods $T_1$ and $T_2$ are given by

$$T_i^2 = 4\pi^2\left(\frac{k^2 + l_i^2}{g\,l_i}\right), \qquad i = 1, 2.$$

(b) If the positions of the knife edges are movable, and adjusted so that $T_1 =
T_2 = T$, then $k^2 = l_1 l_2$ (provided that $l_1 \neq l_2$). So

$$g = 4\pi^2\left(\frac{l_1 + l_2}{T^2}\right) = 4\pi^2 L/T^2,$$

where $L = l_1 + l_2$ can be measured very accurately without having to measure $l_1$
and $l_2$. In practice, it is much better to keep the delicate knife edges stationary,
and add another adjustable weight.








(c) We cannot expect to get $T_1 = T_2$ exactly, no matter how obsessively we
measure, adjust weights, remeasure, ... , so we need to know how the inaccuracies
affect the measurement for g. Show that

$$\frac{4\pi^2}{g} = \frac{l_1 T_1^2 - l_2 T_2^2}{l_1^2 - l_2^2} = \frac{T_1^2 + T_2^2}{2(l_1 + l_2)} + \frac{T_1^2 - T_2^2}{2(l_1 - l_2)},$$

so when $T_1 \approx T_2$ we get a good value when we know $l_1 + l_2$ very accurately,
even though $l_1 - l_2$ might be known with much less accuracy.
Bessel was one of the first to use these ideas to obtain very accurate
measurements of g and to demonstrate the proportionality of mass and weight
mentioned on page 38.
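The formula in part (c) can be tried out numerically. The sketch below is only an illustration: the helper names `period` and `kater_g` and the sample values of $l_1$, $l_2$, k are made up here, and the periods are generated from the small-oscillation formula of part (a) rather than measured.

```python
import math


def period(l, k, g):
    """Small-oscillation period for a pivot at distance l from the center
    of mass, with radius of gyration k (part (a))."""
    return 2 * math.pi * math.sqrt((k * k + l * l) / (g * l))


def kater_g(l1, l2, T1, T2):
    """Recover g from the two periods using the formula of part (c)."""
    four_pi_sq_over_g = ((T1**2 + T2**2) / (2 * (l1 + l2))
                         + (T1**2 - T2**2) / (2 * (l1 - l2)))
    return 4 * math.pi**2 / four_pi_sq_over_g


# Synthetic "measurements": generate the periods from a known g.
g_true, k = 9.81, 0.3
l1, l2 = 0.5, 0.2
T1, T2 = period(l1, k, g_true), period(l2, k, g_true)
print(kater_g(l1, l2, T1, T2))
```

When $T_1$ and $T_2$ are nearly equal, the second term inside `kater_g` is small, which is why an imprecise value of $l_1 - l_2$ does little damage.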


4. Let $D_1$ and $D_2$ be homogeneous 3-balls of the same mass, sliding on a
frictionless surface. $D_2$ is at rest, while $D_1$ approaches it, not directly head on,
with velocity v. We assume that the collision is perfectly elastic, and also that
the surfaces of the balls are perfectly smooth, so that the collision will not cause
any rotation. Using Problem 3-12 and the assumption that the restraint forces
at the point of collision are perpendicular to the surfaces of the balls, find the
resultant velocities $v_1$ and $v_2$.

[Figure: the ball $D_2$ and the resultant velocities $v_1$ and $v_2$.]




5. An object hangs in equilibrium from a thread. Show that the line of the 
thread passes through the center of gravity of the object. 





6. Consider a rhomboid made of four rigid rods hinged at their ends, each of
length l and weight w. The rhomboid is suspended from a fixed point, and a
disc D of radius r and weight W is placed within this rhomboid. The weight
of the rhomboid tends to make α smaller, while the weight of the disc tends to
make α larger. The problem is to find the angle α at equilibrium. As usual, we
are assuming that there is no friction between the disc and the rods.





(a) Let X and Y be the components of the force that AB exerts on AO, and
let N = |N| be the length of the reaction force N that AB exerts on the disc.
Establish the equations

$$\begin{aligned}
(O)\quad & l(X\sin\alpha - Y\cos\alpha) + \tfrac{1}{2}lw\sin\alpha = 0 \\
(B)\quad & l(X\sin\alpha + Y\cos\alpha) - rN\cot\alpha - \tfrac{1}{2}lw\sin\alpha = 0 \\
(AB)\quad & -X + N\sin\alpha + w = 0 \\
(D)\quad & W - 2N\sin\alpha = 0
\end{aligned}$$

(hint: the equation labels indicate the objects on which to compute forces, or
the points about which to compute torques), and use them to deduce that

$$(*)\qquad \cot^3\alpha + \cot\alpha - 2\,\frac{l}{r}\left(1 + 2\frac{w}{W}\right) = 0,$$

which has only one real root for cot α. The diagram below shows how α varies
with (l/r)[1 + 2(w/W)].

[Figure: graph of α (vertical axis, marked at 30°, 60°, 90°) against (l/r)(1 + 2w/W).]
(b) Noting that the vertical diameter of the rhomboid has length 2l cos α, so
that the distances from the fixed point to the center of gravity of the rhomboid
and to the center of the disc are

$$l\cos\alpha \quad\text{and}\quad 2l\cos\alpha - \frac{r}{\sin\alpha},$$

show that V(α), the potential energy of the system for α, is

$$V(\alpha) = C - 4wl\cos\alpha - W\left(2l\cos\alpha - \frac{r}{\sin\alpha}\right)$$

for a constant C. Then use V′(α) = 0 to obtain (*).
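Since the cubic (*) is increasing in cot α, its single real root is easy to find by bisection. Here is a minimal Python sketch; the function name `equilibrium_angle` and the sample dimensions are illustrative assumptions, and no check is made that the disc actually fits inside the rhomboid.

```python
import math


def equilibrium_angle(l, r, w, W):
    """Solve cot^3(alpha) + cot(alpha) = 2 (l/r)(1 + 2 w/W) for alpha.

    The left side is increasing in c = cot(alpha), so bisection on c works.
    """
    rhs = 2 * (l / r) * (1 + 2 * w / W)
    lo, hi = 0.0, max(1.0, rhs)      # c^3 + c < rhs at lo, >= rhs at hi
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid**3 + mid < rhs:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2
    return math.atan2(1.0, c)        # angle in (0, pi/2) with cot(alpha) = c


alpha = equilibrium_angle(l=2.0, r=3.0, w=1.0, W=4.0)
print(math.degrees(alpha))
```

For these sample values the right-hand side of (*) equals 2, so cot α = 1 and the angle should come out at 45°.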


7. A fixed peg at distance D above the floor is centered above a fixed semicircular
track of radius R. A string of total length L passing over the peg is attached
to objects of mass m and M; for convenience, we assume that the dimensions
of the peg and these objects are negligible. Find the equilibrium position.




8. (a) For the wheel analysed on pages 224—226, starting at rest at height h 
from the floor, compute the speed and angular velocity at the bottom of the 
inclined plane, and then the rotational energy $T_{\mathrm{rot}}$, and conclude that the total
energy is again Mgh.

(b) Conversely, use conservation of energy to determine the motion of the 
wheel. 


9. A classical paradox involves a rolling rigid weightless hoop with a point mass 
on it (one has to pretend that this is not already a paradox). When the hoop 


\ 


is in the position shown, the potential energy, as well as the kinetic energy, is 
positive. When the point mass reaches the bottom, its potential energy is 0, and 
so is its kinetic energy, since the velocity of the contact point is 0, contradicting 


conservation of energy. Explain. 
Hint: Mathematically, what is the false assumption in Proposition 1? Physically, what
happens to the forces between the particles of the hoop as the point mass approaches
the bottom?


10. Analyze the motion of a cylinder rolling down a wedge, which can slide 
without friction on the floor. 


11. Sliding particle. Consider a particle of mass m falling from height h along
a frictionless circular path of radius l. Let F be the downward force due to
gravity, of magnitude mg, and $F_r$ the component perpendicular to the path.
Then the path exerts a force $F_T = -F_r$ on the particle, so that the total force
$F_\theta$ is $F - F_r$.

(a) Conclude that the acceleration of the particle, tangent to the circle, has
magnitude $a_\theta = g\cos\theta$, so that

$$(a)\qquad \theta'' + \frac{g}{l}\cos\theta = 0.$$

Notice that this problem is essentially the same as the pendulum problem, except
that the angle $\phi$ for the pendulum is measured from the lowest point, so that
$\phi = \pi/2 + \theta$, and the equation $\phi'' + (g/l)\sin\phi = 0$ is equivalent to (a).
(b) One of the classical elementary mechanics problems is to determine when 
the particle loses contact with the path, a fine example of a physics problem 
where the main difficulty is figuring out what the problem is actually saying, 
and, for good measure, where the answer given is often incomplete. 

We need to determine when the total normal force vanishes. The inward
acceleration is $v^2/l$ (Problem 1-5), while the outward force $F_T$ gives an
acceleration of magnitude $g\sin\theta$. So we need

$$\frac{v^2}{l} = g\sin\theta$$

(compare page 210). Using conservation of energy, show that if the particle
starts at rest (v = 0) at height h, this happens when the height $l\sin\theta$ is 2h/3.
(c) Also solve the problem by multiplying (a) by $\theta'$ to show that

$$\frac{\theta'^2}{2} + \frac{g}{l}\sin\theta = \text{constant},$$

and conclude that if the particle starts at rest at $\theta_0$, we have $\sin\theta = 2\sin\theta_0/3$.
(Of course, this trick is basically the one we used to obtain conservation of 
energy in the first place.) 


We have actually only shown that the total normal force vanishes at 2/3 of the 
original height. To show that the particle actually leaves the path, note that the
formula for the total normal force would give a negative value past this point
if the particle stayed on the path (if our particle were a small loop sliding along
a circular wire, this would mean that it is pulling on the wire instead of being
pushed by the wire). 
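The 2/3 rule can also be checked by brute force, integrating equation (a) and watching for the moment the normal force changes sign. The following Python sketch is one way to do that under stated assumptions: it uses a standard fourth-order Runge-Kutta step (a choice made here, not in the text), the function name `release_height_ratio` is hypothetical, and the starting angle must be strictly between 0 and π/2 so that the particle actually starts to slide.

```python
import math


def release_height_ratio(theta0, g=9.81, l=1.0, dt=1e-4):
    """Integrate theta'' = -(g/l) cos(theta) from rest at theta0 and return
    (height at which the normal force vanishes) / (initial height)."""
    def accel(theta):
        return -(g / l) * math.cos(theta)

    theta, omega = theta0, 0.0
    # Normal force per unit mass: g sin(theta) - l omega^2 (= g sin(theta) - v^2/l).
    while g * math.sin(theta) - l * omega**2 > 0.0:
        # One RK4 step for the system theta' = omega, omega' = accel(theta).
        k1t, k1o = omega, accel(theta)
        k2t, k2o = omega + 0.5 * dt * k1o, accel(theta + 0.5 * dt * k1t)
        k3t, k3o = omega + 0.5 * dt * k2o, accel(theta + 0.5 * dt * k2t)
        k4t, k4o = omega + dt * k3o, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1o + 2 * k2o + 2 * k3o + k4o) / 6
    return math.sin(theta) / math.sin(theta0)


print(release_height_ratio(1.2))   # should be close to 2/3
```

The loop stops at the first step on which the normal force is no longer positive, so the returned ratio is accurate only to about the size of one time step.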


12. Sliding stick I. Consider a stick of mass m and length 2l sliding along a
frictionless floor as it falls. By Problem 5-6, the moment of inertia of the stick
is $ml^2/3$. Since the only forces are gravity acting downward and a reaction
force F acting upward at the point where the stick hits the plane, the center of
mass must fall straight down.




(a) Show that the total energy after the center of mass has fallen distance y is

$$\frac{m y'^2}{2} + \frac{m l^2}{6}\theta'^2 + mg(l - y)$$

and conclude that the velocity of the center of mass after falling distance y is

$$\sqrt{\frac{6gy\sin^2\theta}{3\sin^2\theta + 1}}.$$
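As a sanity check on this formula, one can verify numerically that it balances the energy. The sketch below assumes, as the term mg(l − y) suggests, that the stick starts at rest in the vertical position and that θ is measured from the vertical, so that the center of mass is at height l cos θ = l − y. The function names are hypothetical.

```python
import math


def com_speed(y, l=1.0, g=9.81):
    """Speed of the center of mass after it has fallen a distance y (0 < y < l)."""
    cos_t = 1.0 - y / l                  # l cos(theta) = l - y
    sin_sq = 1.0 - cos_t**2
    return math.sqrt(6 * g * y * sin_sq / (3 * sin_sq + 1))


def energy_defect(y, l=1.0, g=9.81, m=1.0):
    """Kinetic energy gained minus potential energy lost; should be 0."""
    v = com_speed(y, l, g)
    sin_sq = 1.0 - (1.0 - y / l)**2
    theta_dot = v / (l * math.sqrt(sin_sq))   # from y' = l sin(theta) theta'
    kinetic = 0.5 * m * v**2 + (m * l**2 / 6) * theta_dot**2
    return kinetic - m * g * y


print(energy_defect(0.3), energy_defect(0.9))
```

Both printed values should vanish up to rounding error; the formula is not usable at y = 0, where sin θ = 0.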


13. Sliding stick II. Consider a stick of length 2l that is sliding between a
frictionless wall and a frictionless floor. In addition to the downward force of
magnitude mg, there are forces $\mathbf{F}_1$ of magnitude $F_1$ and $\mathbf{F}_2$ of magnitude $F_2$
acting horizontally and vertically at the ends of the stick. This looks even worse
than the previous problem, but we can get much more interesting information.

(a) We have

$$\begin{aligned}
mx'' &= F_1 \\
my'' &= F_2 - mg \\
\frac{ml^2}{3}\theta'' &= lF_1\sin\theta - lF_2\cos\theta,
\end{aligned}$$

and thus

$$\frac{ml^2}{3}\theta'' = ml\sin\theta\,x'' - ml\cos\theta\,y'' - mlg\cos\theta.$$

(b) Differentiate the equations $x = l\cos\theta$ and $y = l\sin\theta$ to obtain expressions
for $x''$ and $y''$, and reduce the above equation to

$$\theta'' + \frac{3g}{4l}\cos\theta = 0.$$




(c) Use the trick in Problem 11 (c) to show that if the stick starts at rest at angle $\theta_0$,
then

$$\theta'^2 = \frac{3g}{2l}(\sin\theta_0 - \sin\theta).$$

Conclude that the stick loses contact with the wall ($0 = F_1 = mx''$) when

$$3\sin\theta = 2\sin\theta_0,$$

and thus when its top point reaches 2/3 of its initial height.

As in Problem 11, the formula for $F_1$ would give a negative value beyond
this point, showing that the stick really leaves the wall, and we could also use
conservation of energy.¹
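The sign change of $F_1$ can be seen directly by substituting the results of parts (b) and (c) into $x'' = -l(\cos\theta\,\theta'^2 + \sin\theta\,\theta'')$. The following Python sketch does exactly that; the function name `wall_force` and the sample angle are illustrative choices, and the formula is meaningful only while the stick is still against the wall.

```python
import math


def wall_force(theta, theta0, m=1.0, g=9.81, l=1.0):
    """F_1 = m x'' for a stick released from rest at angle theta0,
    assuming it is still in contact with the wall at angle theta."""
    omega_sq = (3 * g / (2 * l)) * (math.sin(theta0) - math.sin(theta))  # part (c)
    alpha = -(3 * g / (4 * l)) * math.cos(theta)                         # part (b)
    # x = l cos(theta)  =>  x'' = -l (cos(theta) theta'^2 + sin(theta) theta'')
    return -m * l * (math.cos(theta) * omega_sq + math.sin(theta) * alpha)


theta0 = 1.0
theta_star = math.asin(2 * math.sin(theta0) / 3)
print(wall_force(theta_star + 0.1, theta0),   # positive: wall still pushes
      wall_force(theta_star, theta0),         # approximately zero
      wall_force(theta_star - 0.1, theta0))   # negative: impossible for a wall
```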


Comparison between Problems 11 and 13. Note that the line segment from O
to (x, y) also has length l, so

(i) The center of mass (x, y) moves along a circle of radius l,

(ii) The angle between the horizontal axis and the line from O to (x, y) is
also θ.

Thus the equation in part (b) and equation (a) of Problem 11 differ only by a
factor of 3/4, and the center of mass of the sliding stick moves $\sqrt{3}/2$ times as
fast as the particle sliding down the circle of radius l.


After the fall. Note that when the stick hits the floor we will have x’ > 0, since 
whenever $F_1 \neq 0$ it is always pointing away from the wall. Consequently, the
stick will continue to move away from the wall, until stopped by friction. 

If we consider a stick starting nearly vertical and rotating in the other direc- 
tion, or for a convenient experiment, a ruler placed next to a book standing 
upright on a desk, we end up with the same equations, and now x’ > 0 when 





the ruler hits the desk—the ruler cannot simply pivot around the point where 
the book and desk meet, but must end up to the right of the book. 


¹ Parts of this problem come from the venerable book Osgood [1], which, like Synge
and Griffith [1], also contains material off the beaten track.




14. For the constant $\boldsymbol{\omega}$ of a rolling sphere we can choose our first axis so that
$\boldsymbol{\omega} = (0, \omega\cos\theta, \omega\sin\theta)$ for some θ. Use our equation $v = -Rz \times \boldsymbol{\omega}$ to show
that as the sphere rolls along a straight line in the plane, the path traced out on
the sphere is a circle tilted at angle $\pi/2 - \theta$ to the horizontal.


15. Consider a sphere of mass m and radius a rolling on a rotating turntable
with constant angular velocity Az.

(a) If x(t) is the contact point at time t, then the velocity v of the center of the
ball is

$$\begin{aligned}
v &= A\,z \times x - a\,z \times \omega \\
v' &= A\,z \times v - a\,z \times \omega'.
\end{aligned}$$

(b) As before, $\tau = -az \times F = -az \times mv'$, so

$$\omega' = -\frac{ma}{I}\, z \times v'$$

and thus

$$v' = \frac{A}{1 + ma^2/I}\, z \times v = \frac{2}{7}A\, z \times v.$$

(c) The ball travels in a circle with a frequency 2/7 of the rotation frequency A
of the turntable.
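The factor 2/7 comes entirely from the moment of inertia $I = \tfrac{2}{5}ma^2$ of a homogeneous ball. A small Python sketch with exact fractions makes the dependence on I explicit; the function name and the thin-shell comparison are additions made here for illustration.

```python
from fractions import Fraction


def orbit_to_turntable_ratio(inertia_coeff):
    """Return 1 / (1 + m a^2 / I), where inertia_coeff = I / (m a^2)."""
    return 1 / (1 + 1 / Fraction(inertia_coeff))


print(orbit_to_turntable_ratio(Fraction(2, 5)))   # solid ball: 2/7
print(orbit_to_turntable_ratio(Fraction(2, 3)))   # thin spherical shell: 2/5
```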




16. Find the acceleration for an Atwood machine with a massive pulley having 
moment of inertia J by using conservation of energy. 


17. Analyze the motion, including the tensions of the rope, for an Atwood 
machine with a massive pulley having moment of inertia J when the axle of 
the pulley wheel is free to move and a constant force F is applied to it. Here F 





denotes the total force, i.e., the applied force minus the downward force of
gravity on the pulley wheel. 


18. (a) Consider a massless filament wrapped around a fixed object, in
equilibrium. Let the profile curve of the object be an arclength parameterized curve c,
with unit tangent vector t(s) = (cos θ(s), sin θ(s)), and let T(s) be the tension
at c(s). For small h, the segment of the filament between c(s) and c(s + h) is
practically a straight line of length h so that the normal force along this segment
has magnitude close to |N|h, where N is the normal force exerted by the fixed
object on the filament at c(s). Since the total normal force at c(s) must be 0,
conclude that

$$T(s + h)\sin\Delta\theta = |N|h + o(h)$$

(things can be made more precise by using the Mean Value Theorem, for those
so inclined) and then that

$$|N| = T(s)\theta'(s) = T(s)\kappa(s),$$

where κ is the curvature.

[Figure: the filament between c(s) and c(s + h), with the tension vectors T(s)t(s) and T(s + h)t(s + h) and the angle Δθ = θ(s + h) − θ(s).]


Of course, the filament exerts an equal and opposite force on the object, and 
this force is greater at points of greater curvature, so a string wrapped around 
a parcel bites in deeper at the edges of the parcel. 


(b) Let ΔT = T(s + h) − T(s). Since the tangential force at c(s) should also
be 0, conclude that

$$(T(s) + \Delta T)\cos\Delta\theta = T(s),$$

and then that T′ = 0.


19. Consider a cable anchored at two points, and hanging under the influence
of its own weight, plus possibly other weights attached to it. We will let the
function w be the density of this load, the weight per unit length. The cable
lies along the graph of a function f, and we will let T(x) be the tension of the
cable at (x, f(x)). Let s ↦ (x(s), y(s)) be the arclength parameterization of
the graph, so that x′(s) and y′(s) are the cosine and sine of the unit tangent
vector t(s) at (x(s), y(s)).

[Figure: the cable between x(s) and x(s + h), with the unit tangent vectors t(s) and t(s + h).]


(a) The piece of the cable between (x(s), y(s)) and (x(s + h), y(s + h)) is acted
upon by tension forces at the ends, and forces due to the weight, the latter of
which are all directed straight down. Conclude that the horizontal component
of T · t is constant, that is,

$$(1)\qquad T\frac{dx}{ds} = T_h, \qquad T_h \text{ constant}.$$


(b) Conclude also that if Δl is the length of the cable between these two points,
then

$$T(x(s + h))\,y'(s + h) - T(x(s))\,y'(s) = w(\xi)\cdot\Delta l$$

for some ξ with x(s) < ξ < x(s + h), which implies that

$$(2)\qquad \frac{d}{ds}\left(T\frac{dy}{ds}\right) = w$$

and thus finally

$$(*)\qquad \frac{d}{ds}\left(\frac{dy}{dx}\right) = \frac{w}{T_h}.$$


(c) Consider a suspension bridge, where the cable is regarded as weightless,
while the load, the road the cable supports, has a uniform weight $w_0$ per unit
horizontal length, which means that

$$w = w_0\frac{dx}{ds}.$$

Conclude from (*) that

$$\frac{d^2y}{dx^2} = \frac{w_0}{T_h},$$

so that the graph of f has the shape of the parabola

$$y = \frac{w_0}{2T_h}x^2.$$


(d) Using $x'(s)^2 + y'(s)^2 = 1$, conclude that

$$(3)\qquad \frac{ds}{dx} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2},$$

and thus by (1)

$$T = T_h\sqrt{1 + \left(\frac{dy}{dx}\right)^2}.$$

If the bridge spans the interval [−S, S] and has height H at these ends, we
have $T_h = w_0S^2/2H$, and the maximum of dy/dx on the interval is $w_0S/T_h$,
so the maximum of T on the interval turns out to be

$$w_0 S\sqrt{1 + \frac{S^2}{4H^2}}.$$

So we have to make H large to keep the maximum tension small enough to
insure that the cable doesn’t break.
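To get a feeling for how the height H controls the tension, the quantities in part (d) can be evaluated directly. This is only a sketch: the function name `suspension_bridge` and the sample numbers are invented for illustration, and the units are whatever consistent system one likes.

```python
import math


def suspension_bridge(w0, S, H):
    """Return (T_h, T_max) for a parabolic cable spanning [-S, S] with
    height H at the ends and load w0 per unit horizontal length."""
    T_h = w0 * S**2 / (2 * H)            # from y(S) = w0 S^2 / (2 T_h) = H
    max_slope = w0 * S / T_h             # dy/dx at x = S
    T_max = T_h * math.sqrt(1 + max_slope**2)
    return T_h, T_max


for H in (1.0, 2.0, 4.0):
    print(H, suspension_bridge(w0=2.0, S=3.0, H=H))
```

Doubling H halves the horizontal component $T_h$, and the maximum tension falls toward its lower limit $w_0 S$, the weight carried by half the span.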




20. Now consider a cable of uniform density w per unit length, which is hanging
freely under its own weight, with no additional weights. Then (*) and (3) of the
previous problem give

$$f'' = \frac{w}{T_h}\sqrt{1 + f'^2} = \frac{1}{c}\sqrt{1 + f'^2}$$

(where the primes now indicate d/dx).
This can be solved in various ways. Familiarity with the hyperbolic functions
would suggest writing

$$f' = \sinh\circ v,$$

since this reduces the equation to

$$\cosh\circ v\cdot v' = \frac{1}{c}\sqrt{1 + \sinh^2\circ v} = \frac{1}{c}\cosh\circ v,$$

or simply

$$v(x) = \frac{1}{c}x + \text{constant}.$$

It is convenient to choose the lowest point of the cable at the origin, so that
$f'(0) = 0 \implies v(0) = 0$, so that we have

$$f'(x) = \sinh\frac{x}{c} \implies f(x) = c\cosh\frac{x}{c} - c,$$

a catenary. Also find the tension T.


Another way to solve the equation, written in classical notation as $d^2y/dx^2 =
(1/c)\sqrt{1 + (dy/dx)^2}$, is to use the substitution $p = dy/dx \implies dp/dx =
d^2y/dx^2$, leading to

$$\frac{dp}{dx} = \frac{1}{c}\sqrt{1 + p^2} \implies \frac{dp}{\sqrt{1 + p^2}} = \frac{dx}{c} \implies x = c\sinh^{-1}(p) + C$$

for a constant C and thus to dy/dx = p = sinh([x − C]/c), a trick apparently
introduced by Jacopo Francesco Riccati (1676-1754) in 1712.
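Either derivation can be checked numerically by plugging the catenary back into the differential equation. The sketch below approximates f′ and f″ by central differences, which is a choice made here for simplicity; the function names and the step size are illustrative.

```python
import math


def catenary(x, c):
    """f(x) = c cosh(x/c) - c, with its lowest point at the origin."""
    return c * math.cosh(x / c) - c


def ode_residual(x, c, h=1e-3):
    """f''(x) - (1/c) sqrt(1 + f'(x)^2), using central differences."""
    d1 = (catenary(x + h, c) - catenary(x - h, c)) / (2 * h)
    d2 = (catenary(x + h, c) - 2 * catenary(x, c) + catenary(x - h, c)) / h**2
    return d2 - math.sqrt(1 + d1**2) / c


for x in (-1.5, 0.0, 0.7, 2.0):
    print(x, ode_residual(x, c=2.0))
```

The residuals are not exactly zero because of the finite-difference approximation, but they should be far smaller than either side of the equation.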


21. In the previous problem we essentially found an equilibrium position for 
the cable, so we naturally wonder whether we can also find a solution by looking 
at the minimum point for the “potential function” of y, 


$$V(y) = \int_a^b wy\,ds = \int_a^b wy\sqrt{1 + y'^2}\,dx.$$

The trick here is that we need to find the minimum under the constraint that
the length of the cable is fixed,

$$L(y) = \int_a^b \sqrt{1 + (y')^2}\,dx = \text{constant}.$$


As explained in Addendum 13A, such problems can be solved, analogously to 
Problem 5-2 (c), by finding the Euler equations for the calculus of variations 
problem for V + λL for some “Lagrange multiplier” λ. So for

$$F(x, y, y') = wy\sqrt{1 + y'^2} + \lambda\sqrt{1 + y'^2}$$

we have the Euler equations

$$\frac{\partial F}{\partial y} - \frac{d}{dx}\left(\frac{\partial F}{\partial y'}\right) = 0,$$

where ∂F/∂y and ∂F/∂y′ simply denote the derivatives of F with respect to
its second and third arguments, and the terms in the equation are evaluated at
(x, f(x), f′(x)).


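(a) In this case, where F doesn’t depend on x, we have “Beltrami’s identity”

$$\frac{d}{dx}\left(F - y'\frac{\partial F}{\partial y'}\right) = y'\left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right),$$

leading to the simpler equation

$$F - y'\frac{\partial F}{\partial y'} = C$$

for a constant C.

(b) For our F, deduce that

$$(wy + \lambda)\sqrt{1 + y'^2} - \frac{y'^2(wy + \lambda)}{\sqrt{1 + y'^2}} = C$$

and thus

$$wy + \lambda = C\sqrt{1 + y'^2}.$$

(c) By differentiating the first of these equations deduce finally that

$$y'' = \frac{w}{C}\sqrt{1 + y'^2},$$

basically the same equation as on page 270.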


An interesting discussion of these and related problems, with historical
sidelights, may be found in Nahin [1; Chap. 5, pp. 240-251].




22. In the figure below, the left part of the figure on page 252, the point E is
chosen so that AE = AD.


(a) Show that

$$\frac{P}{Q} = \frac{ED'}{DD'}\cdot\frac{CD}{AD} = \frac{ED'}{DD'}\sin\alpha.$$

(b) For small DD′, the line DE is practically perpendicular to AD. Conclude
that, in the limit,

$$\frac{P}{Q} = \sin^2\alpha,$$

0 


and thus

$$P = \frac{W\sin^2\alpha}{1 + 2\sin^3\alpha} \quad\text{and}\quad Q = \frac{W}{1 + 2\sin^3\alpha}.$$


23. Consider a plank of length 4a resting on three knife edge supports, two at
the ends, and one in the middle, except that now the mass of the plank itself is
negligible, while there are forces of magnitude W exerted at the middle of each
section.


(a) We have

$$EI f''(x) = \begin{cases} Px & 0 < x < a \\ Px - W(x - a) = (P - W)x + Wa & a < x < 2a. \end{cases}$$
(b) Using the fact that f′(2a) = 0 and f(0) = 0, show that

$$EI f(x) = \frac{Px^3}{6} + \left(\frac{W}{2} - 2P\right)a^2x \quad\text{on } (0, a).$$


(c) Using the fact that f′(2a) = 0 and f(2a) = 0, show that

$$EI f(x) = \frac{Px^3}{6} - \frac{Wx^3}{6} + \frac{Wax^2}{2} - 2Pa^2x + a^3\left(\frac{8P}{3} - \frac{2W}{3}\right) \quad\text{on } (a, 2a).$$


(d) Setting these expressions equal at a, conclude that $P = \tfrac{5}{16}W$, and $Q = \tfrac{11}{8}W$.
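The matching condition in part (d) is linear in P, so it can be solved exactly by evaluating the mismatch at two trial values of P. The Python sketch below assumes the two expressions for EI f(x) in the forms $\tfrac{1}{6}Px^3 + (\tfrac{1}{2}W - 2P)a^2x$ on (0, a) and $\tfrac{1}{6}Px^3 - \tfrac{1}{6}Wx^3 + \tfrac{1}{2}Wax^2 - 2Pa^2x + a^3(\tfrac{8}{3}P - \tfrac{2}{3}W)$ on (a, 2a); the function names are hypothetical.

```python
from fractions import Fraction


def left(P, x, W, a):
    """EI f(x) on (0, a)."""
    return P * x**3 / 6 + (W / 2 - 2 * P) * a**2 * x


def right(P, x, W, a):
    """EI f(x) on (a, 2a)."""
    return (P * x**3 / 6 - W * x**3 / 6 + W * a * x**2 / 2
            - 2 * P * a**2 * x + a**3 * (8 * P / 3 - 2 * W / 3))


def two_load_reactions(W=Fraction(1), a=Fraction(1)):
    """Return (P, Q) by requiring the two expressions to agree at x = a."""
    def gap(P):
        return left(P, a, W, a) - right(P, a, W, a)
    g0, g1 = gap(Fraction(0)), gap(Fraction(1))   # gap is linear in P
    P = -g0 / (g1 - g0)
    Q = 2 * W - 2 * P                             # total load is 2W
    return P, Q


print(two_load_reactions())
```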


CHAPTER 7 


PHILOSOPHICAL AND 
HISTORICAL QUESTIONS 


This chapter contains several remarks concerning philosophical questions,
some of which will be relevant to Chapter 10, together with a few tidbits 
of an historical nature. 


Early notions of conservation of momentum. The first statement of the law of
conservation of momentum might be attributed to Descartes, who asserted in
his Principles, 1644, that “God in his omnipotence has created matter together
with the motion and the rest of its parts, and with his day-to-day interference,
he keeps as much motion and rest in the Universe now as he put there when
he created it... ” (Dugas [1; pg. 161]). Arguments of this sort might not get
a very favorable reception in modern physics journals, but the real problem
with Descartes’ formulation was that he confused mv with m|v| (since everyone
was thinking of collisions along a straight line at the time, the scalar speed,
rather than the vector velocity, was the quantity at issue). Partly as a result of
this, almost all the rules of impact formulated by Descartes are simply wrong,
a circumstance that Descartes seems inclined to dismiss as due to experimental
error (see Dugas [1; pg. 163]).

It is therefore not surprising that in 1668 the Royal Society proposed a dis- 
cussion on the subject of the laws of colliding bodies, the impetus for the inves- 
tigations of Wren, Wallis, and Huygens mentioned on page 22, which Newton 
reformulated as the third law. Note that when Newton derives conservation of 
momentum from this law (page 22), it is carefully stated so that Descartes’ error 
is corrected—quantity of motion “is determined by adding the motions made 
in one direction and subtracting the motions made in the opposite direction”. 


Huygens and Galilean Invariance. On page 26ff. we were able to present Huygens’
argument in just a few lines. But Huygens himself gave a much more
detailed argument, which appeared in his book De Motu Corporum ex Percussione,
published posthumously in 1700.

Huygens begins by first explicitly stating three hypotheses that he will use in 
his argument: 

Hypothesis I was basically the law of inertia (Newton’s first law). 

Hypothesis II was basically our assumption, on the basis of symmetry, that 
identical bodies moving toward each other with equal speeds must rebound with
equal speeds. (Huygens’ hypothesis was actually stronger, and his investigations
were quite a bit more complicated—see Problem 3-9 for references.) 


Hypothesis III stated (see Dugas [1; pg. 176]):


The expressions ‘motion of bodies’ and ‘equal or unequal velocities’ 
should be understood relatively to other bodies that are considered as 
at rest, although it may be that the second and the first both participate 
in a common motion. And when two bodies collide, even if both 
are subject to a uniform motion as well, to an observer who has this 
common motion they will repel each other just as if this parasitical 
motion did not exist. 

Thus let an experimenter be carried by a ship in uniform motion
and let him make two equal spheres, that have equal and opposite
velocities with respect to him and the ship, collide. We say that the
two bodies will rebound with velocities that are equal with respect to
the ship, just as if the impact were produced in a ship at rest or on
terra firma.


Huygens’ presentation of his arguments then begins 


Imagine that a ship is carried alongside the bank by the current of 
a river and that it is so close to the edge that a passenger on the ship 
can hold the hands of an assistant on the bank. ... 


and continues for nearly two pages (see Dugas [1; pp. 177-178] for the complete
text), together with a delightfully quaint illustration.

[Figure: the illustration accompanying Huygens’ text.]


Huygens’ detailed explanation probably arose from his stance on a contentious
question. The statement in Hypothesis III that


the expressions ‘motion of bodies’ and ‘equal or unequal velocities’ 
should be understood relatively to other bodies that are considered at 
rest, 


might seem like nothing more than our recognition that the notion of position, 
and thus of velocity, depends on the coordinate system used by the observer, but 
Huygens, like Leibniz and Descartes and his followers, maintained that only 
relative motion had a meaning, whereas Newton felt that one had to resort to a 
notion of “absolute space”, objecting that otherwise velocities could be assigned 
arbitrarily, making it meaningless to say that a body unacted upon by forces has 
constant velocity. 

Nowadays we avoid, or at any rate hope we have managed to avoid, the whole 
problem by stating the first law as on page 11, in terms of the existence of an
inertial system. No doubt, however, objections to this approach can be raised 
also. From Newton’s day on, there have been extensive arguments about this 
matter, all of which might be characterized as being of a philosophical nature. 
Without meaning to assign too pejorative a meaning to that term, let us simply 
say here that between the two viewpoints there was no disagreement on actual 
experimental results. Newton certainly wouldn’t have disputed the claim that 
any laws of mechanics that we discover in the one coordinate system ought to 
hold just as well in the other, because this is, in fact, an immediate consequence 
of Newton’s laws, the crucial point being that the second law involves only v’, 
and not v. The only question is whether we want to give prominence to the 
claim, and note that it implies that Newton’s laws should involve only v′, and
not v, or instead regard the claim as a consequence of Newton’s laws.¹

In any case, whether we decide to note it as a consequence of Newton’s laws, 
or regard it as a fundamental assumption, the basic notion that the laws of
mechanics will appear the same in two coordinate systems moving at uniform
velocity with respect to each other is nowadays often called the Galilean relativity
principle, and Huygens’ argument, despite its limitations, is certainly an alluring 
application of this principle. 

¹ Actually, there’s a whole other aspect of this argument, which we touch upon at the
end of Chapter 10.

Galileo, of course, didn’t use the term “relativity principle”—that terminology
was introduced only after the appearance of “Einstein’s principle of relativity”—
but he did enunciate the principle quite explicitly, and argued it in great detail
in Galileo [1; pp. 186-88 of the University of California Press edition], a very
amusing account, too long to quote here, from which we give a short extract:


Shut yourself up with some friend in the main cabin below decks on 
some large ship, and have with you there some flies, butterflies, and 
other small flying animals. With the ship standing still, observe carefully
how the little animals fly with equal speed to all sides of the cabin.


in throwing something to your friend, you need throw it no more 
strongly in one direction than another, the distances being equal; jump- 
ing with your feet together, you pass equal spaces in every direction. 


[then] have the ship proceed with any speed you like, so long as the 
motion is uniform and not fluctuating this way and that. You will
discover not the least change in all the effects named, nor could you 
tell from any of them whether the ship was moving or standing still. 


The cause of all these correspondences of effects is the fact that the 
ship’s motion is common to all the things contained in it, and to the 
air also. ... 


Galileo used this argument to explain why naive objections to the Coperni- 
can system—that if the earth rotated from west to east, then cannonballs shot 
eastward should fly further than those shot westward, or that objects falling 
from tall buildings should end up to the west of the building when they hit the 
earth—were mistaken, and that book was the one that got him into trouble with 
the Inquisition. 


Newton’s proof of the third law. The little experiment with a magnet and piece
of iron described on page 23ff. was inspired by something in the Principia, 
though it may well be the silliest thing that Newton ever said, at least among 
scientific statements. 

After his description of his pendulum experiments mentioned on page 22, 
which involved the repulsive forces of collisions, Newton also wanted to say some- 
thing about attractive forces, since he had gravity in mind. So after three pages 
describing his careful experimentation, he immediately adds the following para- 
graph, concerning two bodies separated by an interposed obstacle: 


[Figure: two bodies separated by an interposed object.]


I demonstrate the third law of motion for attractions briefly as fol- 
lows. Suppose that between any two bodies A and B that attract each 
other any obstacle is interposed so as to impede their coming together. 
If one body A is more attracted toward the other body B than that 
other body B is attracted toward the first body A, then the obstacle will 
be more strongly pressed by body A than by body B and accordingly 
will not remain in equilibrium. The stronger pressure will prevail and 
will make the system of the two bodies and the obstacle move straight 
forward in the direction from A to B and, in empty space, go on indef- 
initely with a motion that is always accelerated, which is absurd and 
contrary to the first law of motion. ... 


Thus, after three pages of careful experiment, Newton provides a one paragraph
theoretical argument, and this argument is patently nonsense! The first
law is concerned with the force on one body, not on a “system” consisting of 
more than one body. Moreover, the whole argument depends on the fact that 
the “interposed” object is rigid, so that it keeps A and B separated, and of 
course an analysis of rigid bodies presupposes the third law. Finally, we might 
note that the same argument could just as well be made to work for repulsive 
forces: 





What’s even more amazing is that Newton actually described an experiment 
made to test this idea, using vessels floating on water instead of an air trough 
to reduce friction: 


I have tested this with a lodestone and iron. If these are placed in sep- 
arate vessels that touch each other and float side by side in still water, 
neither one will drive the other forward, but because of the equality of 
the attraction in both directions they will sustain their mutual endeav- 
ors toward each other, and at last, having attained equilibrium, they 
will be at rest. 


Considering the great importance that Newton attached to accurate experi- 
ments, his claim to have performed this experiment presumably should be taken 
at face value. But what a truly strange negative experiment this must have been! 
Would any one really expect that the lodestone and iron would continue mov- 
ing together forever, with a motion that is always accelerated?!! On the other 
hand, what a revealing experiment we obtain when we remove the “interposed” 
object. 




The parallelogram law. Some notion of the parallelogram law, at least in terms
of the composition of motions, seems to date back at least to Aristotle (cf. Dugas
[1; pg. 21]): “Let a moving body be simultaneously actuated by two motions
that are such that the distances traveled in the same time are in a constant 
proportion. Then it will move along the diagonal of a parallelogram which has 
as sides two lines whose lengths are in this constant relation to each other.” 


And here is Newton’s statement, and proof: 


A body acted on by [two] forces acting jointly describes the diagonal of a
parallelogram in the same time in which it would describe the sides if the 
forces were acting separately. 


Let a body in a given time, by force M alone impressed in A, be carried
with uniform motion from A to B, and, by force N alone impressed in the
same place, be carried from A to C; then complete the parallelogram ABDC,
and by both forces the body will be carried in the same time along the
diagonal from A to D. For, since force N acts along the line AC parallel
to BD, this force, by law 2, will make no change at all in the velocity
toward the line BD which is generated by the other force. Therefore, the
body will reach the line
BD in the same time whether force N is impressed or not, and so at the 
end of that time will be found somewhere on the line BD. By the same 
argument, at the end of the same time it will be found somewhere on 
the line CD, and accordingly it is necessarily found at the intersection 
D of both lines. And, by law 1, it will go with [uniform] rectilinear 
motion from A to D. 


Even before we reach any questionable steps, we see from the very first phrases 
that Newton is framing this proof in terms of impulsive forces, since he states 
that the forces M and N individually produce a uniform motion on the ob- 
ject. The remaining part of the argument, with its claim that the force N “will 
make no change at all in the velocity toward the line BD which is generated by 
the other force”, really requires for its justification certain remarks that New- 
ton makes after the statement of the Second Law; the text, as quoted at the 
bottom of page 12, is immediately followed by an amplifying paragraph that
concludes “ ... if [the force] was in an oblique direction [to the direction of a
moving body], [it] is combined obliquely and compounded with it according
to the directions of both motions.” In this completed form, the statement of 
the Second Law practically includes the parallelogram law! At best, however, it 
simply demonstrates this result if we regard the body as having already been set
in motion by force M, with the second force N applied a bit later. Essentially
Newton observes that the result holds when N is applied shortly after M, or
vice versa, and concludes that it holds when they are applied at the same time.
Of course, one can say that this is a very reasonable assumption, but that just
replaces one axiom with another!

Moreover, it’s certainly not clear how one would apply this assumption to the 
case of continuous forces, and although Newton’s proof is formulated in terms 
of impulsive forces, he clearly means to apply it to continuous forces also. In 
fact, in his Scholium he mentions Galileo’s observations on the parabolic shape
of a projectile’s path as an illustration of this rule for compounding forces, and 
even goes so far as to provide a little picture: 


For example, let body A by the motion of projection alone describe the
straight line AB in a given time, and by the motion of falling alone
describe the vertical distance AC in the same time; then complete the
parallelogram ABDC, and by the compounded motion the body will be found
in place D at the end of the time; and the curved line AED which the body
will describe will be a parabola which the straight line AB touches at A
and whose ordinate BD is as $AB^2$.


Here, of course, we are considering, on the one hand, an impulsive force, which 
gives the object its uniform horizontal motion, and, on the other hand, the force 
of gravity, which gives the object its non-uniform vertical motion. And indeed 
this really illustrates only that the action of a force on an object is independent 
of the object’s uniform velocity, which was Galileo’s basic observation. 

Newton’s questionable proof of the parallelogram law apparently stimulated 
the search for more convincing arguments, a somewhat quixotic enterprise, 
since we are trying to provide a mathematical “proof” for a non-mathematical 
question. It is probably better to rephrase it as an investigation into whether the 
parallelogram law can be deduced from other assumptions that we might con- 
sider more basic. Daniel Bernoulli offered one of the first such demonstrations 
in 1726. Other proofs appear in Laplace’s great Traité de Mécanique Céleste of 
1799 and in Poisson’s Traité de Mécanique of 1833, and a short proof was offered
by Hamilton in 1841, to name just some of the more illustrious contributors to 
this question. 


' Pourciau [5; pp. 161-163] and [6] gives a rather different interpretation to the whole 
historical problem that we are discussing, involving an investigation of what Newton 
“really meant” by his statement of the second law. 




All these proofs are shrouded in a somewhat impenetrable veil of unstated 
assumptions, making their decipherment rather difficult, especially since even 
mathematical results were often stated rather vaguely in those days. More sig- 
nificantly, all these proofs have one feature in common: they use, but carefully 
do not state, the one essential hypothesis without which no conclusions can 
possibly be drawn. 


To illustrate this point, we will indicate the first part of Bernoulli’s proof.
To simplify the discussion, let us restrict ourselves to $\mathbb{R}^2$, so that we are only
considering forces in one plane. As we already mentioned long ago, in the
footnote on page 29, the first basic assumption is that two forces $v$, $w$ acting
together have the same effect as some other force. Thus, we are assuming that
for each pair $v, w \in \mathbb{R}^2$ we have another element $v \oplus w \in \mathbb{R}^2$. We presumably
shouldn’t object to assuming that $v \oplus w = w \oplus v$ and also equations like
$v \oplus v = 2v$ (compare Problem 1-25). We will also need an hypothesis that
expresses our experience that the laws of nature are invariant under orthogonal
maps:


If $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ is any orthogonal map, for the usual vector space
structure of $\mathbb{R}^2$, then

$$T(v \oplus w) = T(v) \oplus T(w) \quad \text{for all } v, w \in \mathbb{R}^2.$$

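Bernoulli begins by considering two perpendicular vectors, $v_1$ of length $a$,
and $v_2$ of length $b$, with $x$ being the length of $v_1 \oplus v_2$. Let $w_2$ be the vector
on the line perpendicular to $v_1 \oplus v_2$ with length $ab/x$ and let $w_1$ be the vector
along $v_1 \oplus v_2$ of length $a^2/x$. Since

$$\begin{aligned}
\operatorname{length} w_1 &= \tfrac{a}{x} \cdot \operatorname{length} v_1 \\
\operatorname{length} w_2 &= \tfrac{a}{x} \cdot \operatorname{length} v_2 \\
\operatorname{length} v_1 &= \tfrac{a}{x} \cdot \operatorname{length}(v_1 \oplus v_2),
\end{aligned}$$

and the angle $\alpha$ from $w_2$ to $v_1$ equals the angle from $w_1$ to $v_2$, there is an
orthogonal map $T$ (involving a rotation through the angle $\alpha$, together with a
reflection) such that

$$\begin{aligned}
\tfrac{a}{x}\, T(v_1) &= w_1 \\
\tfrac{a}{x}\, T(v_2) &= w_2 \\
\tfrac{a}{x}\, T(v_1 \oplus v_2) &= v_1.
\end{aligned}$$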


Consequently, our hypothesis of invariance under orthogonal maps implies that 


(1) $w_1 \oplus w_2 = v_1$.


But similarly, we can consider $z_1$ and $z_2$ of lengths $b^2/x$ and $ab/x$ respectively,
and conclude that we have

(2) $z_1 \oplus z_2 = v_2$.

Since $w_2 = -z_2$, equations (1) and (2) give

(3) $w_1 \oplus z_1 = v_1 \oplus v_2$.


But $w_1$ and $z_1$ lie along $v_1 \oplus v_2$, so the length of $v_1 \oplus v_2$ is the sum of the
lengths of $w_1$ and $z_1$, which means that

$$\frac{a^2}{x} + \frac{b^2}{x} = x \implies a^2 + b^2 = x^2 \implies x = \sqrt{a^2 + b^2},$$

and thus $v_1 \oplus v_2 = ae_1 \oplus be_2$ has length $\sqrt{a^2 + b^2}$, which is the length of
$ae_1 + be_2$.




Thus, Bernoulli has demonstrated that $v_1 \oplus v_2$ has precisely the length you
would expect it to have, in the special case that $v_1$ and $v_2$ are perpendicular.
He then proceeds by involved arguments to prove the complete result, for the 
general case. 


The one point that is usually ignored is that in our quick trip from equations
(1) and (2) to (3), we had to use associativity of $\oplus$, which is likewise used in all
the other proofs that have been fashioned. But if we assume associativity of $\oplus$,
then everything is essentially trivial: Consider the map

$$(a, b) = ae_1 + be_2 \mapsto ae_1 \oplus be_2.$$

If $\oplus$ is associative, then this map will be linear. But it takes $e_1$ to $e_1$ and $e_2$
to $e_2$, so it must be the identity. Q.E.D. The somewhat convoluted discussion
in Mach [1; pp. 55-57] essentially boils down to the same point. Note also that
there is no reasonable way to verify associativity experimentally without already
being able to measure what $v \oplus w$ is.


Newton at the hands of the scholars. As pointed out in Addendum 4A, almost 
none of Newton’s contemporaries realized the significance of Proposition 41 of 
the Principia, which amounts to the modern solution to the problem of inverse 
square forces, while the geometric proof that he first supplied confused them by 
its brevity. In fact, the particular form in which Newton chose to present his 
proof has spawned all sorts of scholarly arguments, which began in his day and 
have continued into ours. As we mentioned in Chapter 2, in the first edition 
of the Principia (1687), the Corollary on page 62 was simply stated, without the 
remaining two explanatory sentences. These were added to the second edition
(1713), and we have a letter of 1709 in which Newton instructs the editor to
supply them. 

This date is significant because in 1710 Johann Bernoulli criticized the Corol- 
lary on various grounds, including the assertion that it basically amounted to 
assuming the converse of a proposition on the basis of the proposition itself.
Skipping over the acrimonious disputes that arose,¹ we merely note that in
1719, Bernoulli wrote Newton an apologetic, not to say obsequious, letter that 
said in part 


Gladly I believe what you say about the addition to Corollary 1, Propo- 
sition 13, Book One of your incomparable work, the Principia, that 
this was certainly done before these disputes began, nor have I any 


¹ For a thorough treatment, see Guicciardini [1]; by the way, pg. 546 of this paper points
out that Newton definitely knew how to evaluate the integrals required for a direct proof. 




doubts that the demonstration of the inverse proposition, which you 
have merely stated in the first edition of the work, was yours; I only ... 
wished that someone would give a [direct proof]. This indeed, which 
I would not have said to your displeasure, I think was first put forward 
by me, at least so far as I know at present.


Actually, Bernoulli was doubly wrong on this account, as Newton had already 
provided the proof in Proposition 41 of the Principia, nor did Bernoulli even 
have priority for a published specifically analytic proof, as pointed out in Guic- 
ciardini [1]. 

Bernoulli’s letter may be found in Newton [1, Vol. 7, p. 77], where the real 
reason for writing it is revealed a bit later. 


Now one would think that this would have settled the matter! But Newton has 
continued to be faulted by various physicists. Wintner [1] offhandedly credited 
Bernoulli with being the first to prove that the paths are conics, and more 
recent challenges have led to all sorts of embarrassing scholarly cat fights,¹ which
should have been quite unnecessary since the third edition makes the argument 
quite clear to any one with mathematical instincts. 


It might also be mentioned that Arnold addressed this question in a very dif- 
ferent way, by arguing that Newton knew “in essence” a short theorem—with a 
neat one-paragraph proof, Arnold [3; pg. 32]—which immediately provides the 
desired result. Arnold might have been overly generous in attributing knowl- 
edge of this result to Newton, or perhaps he was simply singularly in tune with 
Newton’s mode of thought. 


¹ See Cohen and Whitman [1; pp. 135-136] for many references; Pourciau [1] is cited
as a corrective antidote, and further details are added in Pourciau [2]. 


PART II 


BUILDING ON 
THE FOUNDATIONS 


CHAPTER 8 
OSCILLATIONS 


The investigations that are to follow will teach us nothing 
new about the principles of mechanics. So great, however, is the 
significance of oscillation processes for physics and engineering 
that their separate systematic treatment is deemed essential.


— Sommerfeld, Mechanics 


Thus opens the third chapter, on oscillation problems, in the famous book
on Mechanics by Arnold Sommerfeld. Actually, various results about os- 
cillations, and about the equations defining them, are needed in succeeding 
chapters, but we will proceed somewhat in the same spirit, first discussing some 
specific interesting mechanical systems, then progressing to topics of greater 
generality, which can often be applied to mechanics itself. 
One sort of oscillation is already familiar to us. In our analysis of the pendulum
moving through a small angle in Problem 1-21, we have already encountered
the equation

$$x'' + \omega^2 x = 0,$$

with oscillatory solutions

$$x(t) = a\cos\omega t + b\sin\omega t, \qquad a = x(0), \quad b = x'(0)/\omega.$$

This is known as “simple harmonic motion”, and we should note that it can
also be written in terms of the amplitude $A$ and the phase $\phi$, as

$$x(t) = A\sin(\omega t + \phi), \qquad A = \sqrt{a^2 + b^2}, \quad \tan\phi = a/b.$$

More systematically, we can seek a solution in terms of the exponential function
$x(t) = e^{\lambda t}$, giving the equation $\lambda^2 + \omega^2 = 0 \implies \lambda = \pm i\omega$, noting
that $Ce^{i\omega t} + De^{-i\omega t}$ will be real if $C$ and $D$ are conjugate; the simplest choice
$C = D = a/2$ for real $a$ gives $a\cos\omega t$, while $C = -ib/2 = -D$ for real $b$ gives
$b\sin\omega t$.

The oscillating motion repeats itself over a period $T$ of $2\pi/\omega$, so that the
frequency $\nu = 1/T$ is $\nu = \omega/2\pi$. The term circular frequency is often used for
$\omega = 2\pi\nu$.
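The passage between the two descriptions of simple harmonic motion can be checked numerically. The following is only a minimal sketch, assuming the amplitude-phase form is taken as $A\sin(\omega t + \phi)$; the function names are ours and purely illustrative.

```python
import math

def amplitude_phase(a, b):
    """Return (A, phi) such that a*cos(w t) + b*sin(w t) == A*sin(w t + phi).

    Expanding A*sin(w t + phi) gives A*sin(phi) = a and A*cos(phi) = b,
    so A = sqrt(a^2 + b^2) and phi = atan2(a, b).
    """
    A = math.hypot(a, b)
    phi = math.atan2(a, b)
    return A, phi

def x_components(a, b, w, t):
    """Solution written with the initial data a = x(0), b = x'(0)/w."""
    return a * math.cos(w * t) + b * math.sin(w * t)

def x_amplitude_phase(A, phi, w, t):
    """The same solution written with amplitude A and phase phi."""
    return A * math.sin(w * t + phi)
```

Using `atan2` rather than a bare arctangent of $a/b$ keeps the phase in the correct quadrant when $b \le 0$.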




Huygens’ cycloidal pendulum. In Problem 1-21 we had to restrict ourselves to
oscillations through small angles because the pendulum is not isochronous (its
period is not constant), but merely close to isochronous for oscillations through
small angles, which can be rather inconvenient in the design of pendulum clocks.
This problem was handled in an ingenious way by Huygens, who has been mentioned
in Chapters 1 and 7 and who will also make an important appearance
later on.

[Figure: a cycloid and an upside-down cycloid, with parameterizations]

cycloid: $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$
upside-down cycloid: $x = a(\theta - \sin\theta)$, $y = a(1 + \cos\theta)$

Huygens discovered that an upside-down cycloid is tautochronous: if a
particle starts at rest from any point, the time taken to slide down to the bottom
is always the same. So a pendulum bob sliding on a perfectly smooth cycloid
would have the same period no matter how large or small the angle through
which the pendulum bob slid. Huygens managed to obtain the equivalent of this
frictionless situation by means of another wonderful fact he discovered about
cycloids: A cycloid has an “involute” that is another cycloid of the same size: If
we tie a string of length $4a$ to the top vertex of a cycloid and pull it taut against
the cycloid, then as we pull it out tangent to the cycloid we obtain a congruent
cycloid, basically just like a simple pendulum swinging along a circular arc. It
was Huygens who introduced the notion of involute, as well as the related notion
of evolute, treated geometrically in terms of envelopes, see Addendum B and
Problem 2.

Though we now have many alternatives to pendulum clocks for accurate 
measurements of time, Huygens’ investigation inspired Abel’s pioneering work, 
described briefly in Addendum A, in the subject of integral equations. 

Huygens originally proved the cycloid was tautochronous by complicated geo- 
metric arguments, which can be found in English translation in Huygens [1] or 
on Ian Bruce’s wonderful web site 17centurymaths.com. The analytic proofs 
one encounters nowadays are somewhat opaque, but we can give a proof that is
probably close to the spirit of Huygens’ arguments by using a fact about cycloids 
that is most easily proved geometrically. 




In the figure below, (a) shows the position that the top point P of the circle
has moved into after the circle has rolled a certain amount. Since the circle
is rolling, we know from the Proposition on page 222 that “up to first order”
the motion of P is now simply rotation about C, which means that the tangent
line to the cycloid at P is perpendicular to CP, as in (b); and since an angle
inscribed in a semi-circle is a right angle, and conversely, this means that the
tangent line goes through the point directly above C. Finally, from (c) we easily
see that if the circle has rotated by the angle $\theta$ to get to this position, then the
angle between the tangent and the horizontal line above the circles is just $\theta/2$.
Turning this picture upside down, this means that for an upside-down cycloid,
after the circle rotates through an angle of $\theta$ from the bottom, the tangent line
has slope $\theta/2$; an awkward analytic derivation can be found in Problem 1.

Now if we let $s(\theta)$ be the length of the cycloid as a function of $\theta$, then from
the parameterization on the previous page we get

$$s'(\theta)^2 = x'(\theta)^2 + y'(\theta)^2 = 2a^2(1 - \cos\theta) = 4a^2\sin^2(\theta/2) \quad \text{by the half angle formulas.}$$

So the length from the bottom (reached after rotation through the angle $\pi$) to
the point where the circle has rotated through an additional angle of $\theta$, is

$$s(\theta) = \int_\pi^{\pi+\theta} s'(\vartheta)\,d\vartheta = \int_\pi^{\pi+\theta} 2a\sin(\vartheta/2)\,d\vartheta = 4a\sin(\theta/2).$$

But since the slope of the tangent line is $\theta/2$ at this point, a particle sliding down
the cycloid, with $s(t)$ its distance from the bottom, satisfies the equation

$$s''(t) = -g\sin(\theta/2) = -\frac{g}{4a}\,s(t),$$

and we have simple harmonic motion with a period of $2\pi\sqrt{4a/g} = 4\pi\sqrt{a/g}$,
independent of the point from which the particle begins sliding; the time of
descent, a quarter of the period, is always $\pi\sqrt{a/g}$.
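The tautochrone property can also be tested numerically without appealing to the harmonic equation. The sketch below is only an illustration under stated assumptions: it measures $\theta$ from the bottom of the upside-down cycloid, uses $ds = 2a\cos(\theta/2)\,d\theta$ and the height $a(1-\cos\theta)$ above the bottom, and obtains the speed from conservation of energy. The name `descent_time`, the substitution $\theta = \theta_0 - t^2$ (which removes the square-root singularity at the starting point), and the midpoint rule are our choices, not anything taken from Huygens.

```python
import math

def descent_time(theta0, a=1.0, g=9.81, n=4000):
    """Time to slide from rest at parameter theta0 (0 < theta0 < pi) to the bottom.

    Integrates ds / v, with v = sqrt(2 g * (height drop)), after the
    substitution theta = theta0 - t^2, using the midpoint rule in t.
    """
    upper = math.sqrt(theta0)
    h = upper / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        theta = theta0 - t * t
        # arc length element: ds = 2a cos(theta/2) dtheta, and dtheta = 2 t dt in size
        ds_dt = 2.0 * a * math.cos(theta / 2.0) * 2.0 * t
        # height drop a(cos(theta) - cos(theta0)), written as a product of sines
        # to avoid cancellation near the starting point
        drop = 2.0 * a * math.sin((theta + theta0) / 2.0) * math.sin((theta0 - theta) / 2.0)
        speed = math.sqrt(2.0 * g * drop)
        total += ds_dt / speed * h
    return total
```

For several starting points the computed time should agree with $\pi\sqrt{a/g}$, whatever the value of `theta0`.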




The spherical pendulum. The spherical pendulum, introduced in Problem 3-5,
is a very valuable example, which will be used to illustrate important points
later on, especially in Chapters 12 and 21. An exact analysis involves elliptic
functions, just as for the ordinary pendulum, but we can give an analysis that
describes the basic features of the motion, which will have some interesting
parallels in Chapter 9. Regarding our pendulum bob as a particle $c(t) =
(x(t), y(t), z(t))$ of mass $m$, Problem 3-5 gives


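(a) $xy' - x'y = C$

for a constant $C$, while conservation of energy gives

(b) $\tfrac12 mv^2 + mgz = E$.

In terms of the spherical coordinates $\theta$ and $\phi$, with

$$\begin{aligned}
x &= l\cos\phi\sin\theta, & x' &= l\theta'\cos\phi\cos\theta - l\phi'\sin\phi\sin\theta, \\
y &= l\sin\phi\sin\theta, & y' &= l\theta'\sin\phi\cos\theta + l\phi'\cos\phi\sin\theta, \\
z &= l(1 - \cos\theta), & z' &= l\theta'\sin\theta,
\end{aligned}$$

we find that

$$v^2 = x'^2 + y'^2 + z'^2 = l^2(\theta'^2 + \phi'^2\sin^2\theta)$$

and we can write equations (a) and (b) as

(a′) $l^2\phi'\sin^2\theta = C$
(b′) $\tfrac12 ml^2(\theta'^2 + \phi'^2\sin^2\theta) + mgl(1 - \cos\theta) = E$.

The substitution

$$u = \cos\theta, \qquad \theta' = \frac{-u'}{\sqrt{1 - u^2}}$$

leads to

$$u'^2 = \frac{2}{ml^2}(1 - u^2)\bigl(E - mgl(1 - u)\bigr) - \frac{C^2}{l^4} = f(u), \text{ say},$$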


which means that $u = \cos\theta$ can only have values where the cubic $f(u) \ge 0$. If
the pendulum isn’t swinging in a plane, then $C \ne 0$, so $f(u) < 0$ for $u = \pm 1$,
and thus $u$ lies in some interval $[u_1, u_2]$ with $-1 < u_1 < u_2 < 1$ (the figure,
with $u_1 > 0$, is for a pendulum whose swing remains below the horizontal).
Differentiation of our equation for $u'^2$ leads to a second order equation

$$2u'' = f' \circ u \quad \text{or, in Leibnizian notation,} \quad 2\frac{d^2u}{dt^2} = f'(u),$$

of the very form that we encountered in Chapter 4 (see page 128), and the
pendulum, when viewed from above, exhibits the same characteristics as an
elliptical orbit, with the height $l(1 - \cos\theta) = l(1 - u)$ of the pendulum bob
varying between $l(1 - u_1)$ and $l(1 - u_2)$. (Problem 1-20 covers the case $u_1 = u_2$.)
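To see the interval of allowed values of $u = \cos\theta$ concretely, one can locate the zeros of the cubic numerically. What follows is a minimal sketch, assuming $f(u) = \frac{2}{ml^2}(1-u^2)(E - mgl(1-u)) - C^2/l^4$ with $C = l^2\phi'\sin^2\theta$; the helper names and the sample initial data are hypothetical, chosen only for illustration.

```python
import math

def make_f(m, l, g, E, C):
    """Cubic f(u) with u'^2 = f(u), where u = cos(theta)."""
    def f(u):
        return (2.0 / (m * l * l)) * (1.0 - u * u) * (E - m * g * l * (1.0 - u)) - C * C / l**4
    return f

def turning_points(f, n=2000):
    """Zeros of f in (-1, 1): scan a grid for sign changes, then bisect."""
    roots = []
    grid = [-1.0 + 2.0 * k / n for k in range(n + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if f(lo) * f(hi) < 0.0:
            for _ in range(80):
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0.0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

# Hypothetical initial data: bob released at theta0 with theta' = 0
# and an initial sideways angular velocity phidot0.
m, l, g = 1.0, 1.0, 9.81
theta0, phidot0 = 1.0, 2.0
C = l * l * phidot0 * math.sin(theta0) ** 2
E = 0.5 * m * l * l * phidot0**2 * math.sin(theta0) ** 2 + m * g * l * (1.0 - math.cos(theta0))
u1, u2 = turning_points(make_f(m, l, g, E, C))
```

With these data the release point is itself a turning point, so one of the two zeros should coincide with $\cos\theta_0$, and the height of the bob then varies between $l(1-u_2)$ and $l(1-u_1)$.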

If, as in the case of the regular pendulum, we want to consider small oscillations
of the spherical pendulum, the coordinates $\phi$ and $\theta$ don’t seem very
promising for our analysis, since they don’t play a symmetric role. Indeed, $\theta$
really wasn’t a good choice for the regular pendulum problem, for we had to
replace the $\sin\theta$ by $\theta$ in Problem 1-21 in order to get a linear equation, and as
Problems 1-22 and 3-16 indicate, it will be easier simply to use the coordinates $x$
and $y$ instead. The method of Problem 3-16 (with $z$ now playing the role of $y$
in that problem) leads to the equation

$$x'' + \frac{g}{l}x = y'' + \frac{g}{l}y = 0,$$

and we can then simply solve $x'' + (g/l)x = 0$ and $y'' + (g/l)y = 0$ separately,
each as linear combinations of $\sin\omega t$ and $\cos\omega t$ for $\omega^2 = g/l$, and obtain
any given initial conditions by an appropriate combination of these solutions.
Problem 2-4 implies that the $(x, y)$ component of the path of the pendulum bob
is a small ellipse, which might seem like a poor approximation to the rotating
elliptical path previously described. Unlike the case of the ordinary pendulum,
where we definitely had an oscillating motion, which we then approximated
to first order by a harmonic oscillation, in the case of the spherical pendulum
we have made a first order approximation to the whole problem at the outset,
before identifying a specific oscillatory motion to which our approximations
should apply, so we are only studying small oscillations of an approximation,
not harmonic oscillatory approximations to actual small motions!


Springs. In addition to the beautiful oddity of the Huygens pendulum, oscillations
with constant period are also produced by springs, which obey Hooke’s
law, briefly mentioned on page 47 as well as in Addendum 6B. For a spring with
one end fixed and unstretched length $l_0$, there is a “spring constant” $k > 0$ so
that when the spring is stretched to length $l$ the force on the end of the spring
is $-kx$ for $x = l - l_0$. This is true only for $|x|$ for which $l_0 + x$ is within a
certain “elastic limit”. However, this range is far greater than the range for
small oscillations of a pendulum, and within this range the proportionality is
quite accurate (though spring stretching is actually quite complicated, involving
a twisting of the spring around its axis, and a reshaping of the spring; the
stretching of a wire as discussed on page 253 is simpler, at least conceptually).

[Figure: a spring of unstretched length $l_0$, stretched to length $l_0 + x$ and compressed to length $l_0 - x$]
For an object of mass $m$ attached to a spring of negligible mass and sliding
on a frictionless surface, moving it further from its equilibrium point sets up a
motion described by the equation

$$x'' = -Kx, \qquad K = k/m.$$


Or we could eliminate the problem of friction by using two springs of the
same construction to suspend the weight between two walls, where $K$ will now
have twice the value, if we neglect the sagging due to gravity, or work in a
weightless environment. We could also simply hang the weight from one end of
the spring, which is suspended from the ceiling at the other end; in this case,
we should measure $x$ as the displacement from the equilibrium point, rather
than from the unstretched length of the spring. In all cases we end up with an
equation of the form

$$x'' + Kx = 0, \qquad K > 0,$$

whose solutions we mentioned at the beginning of the chapter.
For

$$x(t) = A\sin(\omega t + \phi), \qquad x'(t) = A\omega\cos(\omega t + \phi), \qquad \omega = \sqrt{k/m},$$

if we choose the potential energy $U$ to be 0 at $x = 0$, or $\sin(\omega t + \phi) = 0$, and
hence $\cos(\omega t + \phi) = \pm 1$, the total energy $E$ at this time is the kinetic energy

$$\tfrac12 mv^2 = \tfrac12 m|A\omega\cos(\omega t + \phi)|^2 = \tfrac12 mA^2\omega^2 = \tfrac12 A^2 k,$$

so that $E = \tfrac12 kA^2$ at all times. Conversely (Problem 3), the method of deriving
conservation of energy can be used to obtain our solution for $x(t)$.
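The constancy of the energy is easy to confirm by direct evaluation. The sketch below assumes the potential energy of the spring is $U(x) = \tfrac12 kx^2$ (the potential of the force $-kx$ normalized so that $U(0) = 0$); the function name is ours.

```python
import math

def spring_energy(m, k, A, phi, t):
    """Kinetic plus potential energy of x(t) = A sin(w t + phi), w = sqrt(k/m)."""
    w = math.sqrt(k / m)
    x = A * math.sin(w * t + phi)
    v = A * w * math.cos(w * t + phi)
    kinetic = 0.5 * m * v * v
    potential = 0.5 * k * x * x  # U(x) = k x^2 / 2, so U(0) = 0
    return kinetic + potential
```

At every time the value returned should equal $\tfrac12 kA^2$, since $m\omega^2 = k$ makes the two terms add up to $\tfrac12 kA^2(\cos^2 + \sin^2)$.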


Harmonic oscillations. Our spring example is about the only mechanical one
giving rise to simple harmonic oscillation not restricted to small oscillations,
but simple harmonic oscillation occurs in other important physical systems. In
particular, for a simple circuit involving an inductance $L$ attached across the
plates of a capacitance $C$, the charge $q$ satisfies

$$Lq'' + q/C = 0,$$

so that it varies sinusoidally, over a large range of values, which is why it is so
easy to get a beautiful sine curve on an oscilloscope, by sweeping vertically with
such an oscillating voltage while sweeping horizontally with uniform speed. In
fact, many investigations of simple harmonic oscillation are made with electrical
examples in mind.

By using the output from two different circuits to sweep horizontally and
vertically on an oscilloscope, we can easily get a system whose two coordinates
each exhibit simple harmonic oscillation,

$$\begin{aligned}
x_1(t) &= A_1\cos(\omega_1 t + \phi_1) \\
x_2(t) &= A_2\cos(\omega_2 t + \phi_2),
\end{aligned}$$

producing so-called Lissajous figures. When $\omega_1 = \omega_2$ the resulting figure, inside
the rectangle $[-A_1, A_1] \times [-A_2, A_2]$, is always a (possibly degenerate) ellipse
(Problem 2-4 again). The figure below shows several cases, where, setting
$\delta = \phi_2 - \phi_1$, we have shifted the time parameter so that the equations can
be written in terms of the phase difference $\delta$ as

$$\begin{aligned}
x_1(t) &= A_1\cos(\omega t) \\
x_2(t) &= A_2\cos(\omega t + \delta);
\end{aligned}$$

the arrows on the paths indicate the direction of the path for increasing $t$. The
next figure shows the results as $\delta$ increases to $2\pi$, and the bottom figure shows
several results for $\omega_2 = 2\omega_1$, where the first and last curves are parabolas.

[Figures: Lissajous curves for $\omega_1 = \omega_2$ with various phase differences $\delta$, and for $\omega_2 = 2\omega_1$]
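Both claims, the ellipse for equal frequencies and the parabola for $\omega_2 = 2\omega_1$, can be checked on sample points. This is only a sketch under the normalization used above (time shifted so that $x_1 = A_1\cos\omega_1 t$); the implicit ellipse equation $X^2 + Y^2 - 2XY\cos\delta = \sin^2\delta$, with $X = x_1/A_1$ and $Y = x_2/A_2$, is our own elimination of $t$, and the function names are illustrative.

```python
import math

def lissajous(A1, A2, w1, w2, delta, t):
    """Point of the Lissajous curve x1 = A1 cos(w1 t), x2 = A2 cos(w2 t + delta)."""
    return A1 * math.cos(w1 * t), A2 * math.cos(w2 * t + delta)

def ellipse_residual(A1, A2, delta, x1, x2):
    """Left side minus right side of X^2 + Y^2 - 2 X Y cos(delta) = sin(delta)^2."""
    X, Y = x1 / A1, x2 / A2
    return X * X + Y * Y - 2.0 * X * Y * math.cos(delta) - math.sin(delta) ** 2

def parabola_residual(A1, A2, x1, x2):
    """For w2 = 2 w1 and delta = 0, cos(2s) = 2 cos(s)^2 - 1 gives Y = 2 X^2 - 1."""
    X, Y = x1 / A1, x2 / A2
    return Y - (2.0 * X * X - 1.0)
```

For $\delta = 0$ or $\pi$ the ellipse equation factors as $(X \mp Y)^2 = 0$, which is the degenerate case of a line segment.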







The shape of the Lissajous figure is very sensitive to the ratio of $\omega_1$ and $\omega_2$.
In general, if $\omega_1$ and $\omega_2$ are commensurable, then their ratio is the ratio of the
numbers of tangencies of the figure along a horizontal and a vertical side of the
rectangle, except for the cases where the figure enters a corner. If $\omega_2$ is just
a bit larger than $\omega_1$, then the segments for $[0, 2\pi]$, $[2\pi, 4\pi]$, ... , are close to
ellipses, but they keep rotating, so the Lissajous curve slowly rotates around, and
if $\omega_1$ and $\omega_2$ are not commensurable the Lissajous figure is not periodic, and its
image is dense in the rectangle. Lissajous figures on an oscilloscope thus give a
good test of whether two frequencies are the same. Lissajous himself, before the
days of oscilloscopes, tested whether two tuning forks had the same frequency
by directing a narrow beam of light onto a tiny mirror glued to one tine of the
first vibrating tuning fork, which reflected it to a mirror on the second vibrating
tuning fork, which in turn reflected it to a screen.


Damped oscillations. In practice, of course, springs never provide truly har- 
monic oscillation. Even if we ignore the fact that springs are not massless, there 
is always some outside force, like the friction of the moving weight on the floor, 
or the resistance of the surrounding air or a fluid, that tends to slow the oscilla- 
tion down. Moreover, there will generally be internal factors, like the fact that 
the spring heats up, that act similarly. It is customary to consider the case where
the total “damping” force is proportional to the velocity. The frictional force
of a moving weight on a floor is not proportional to velocity, but a resistance 
proportional to velocity does describe fairly well the case of air or fluid friction 
if the motion is slow enough, and internal factors also often act this way, as in 
the case of the tuning fork and the rubber band on page 298. Perhaps most 
important of all, the oscillatory behavior of the charge q in electrical circuits 
often has this character. 
Instead of our equation $x'' + \omega^2 x = 0$, we will now write

$$x'' + 2\rho x' + \omega_0^2 x = 0, \qquad \rho > 0,$$

where $\omega_0$ is the “natural” circular frequency that we would have without the
damping force, and the factor 2 is inserted to simplify some algebra. If we try
for a complex solution that is a multiple of $x(t) = e^{\omega t}$ we obtain

$$\omega^2 + 2\rho\omega + \omega_0^2 = 0$$

with the two roots

$$\omega_1 = -\rho + \sqrt{\rho^2 - \omega_0^2}, \qquad \omega_2 = -\rho - \sqrt{\rho^2 - \omega_0^2},$$

and the nature of the solution depends on the sign of $\rho^2 - \omega_0^2$.


$\rho < \omega_0$, “underdamped”. Letting $\omega = \sqrt{\omega_0^2 - \rho^2}$, we have

$$x(t) = e^{-\rho t}\bigl(Ae^{i\omega t} + Be^{-i\omega t}\bigr),$$

giving the solutions $e^{-\rho t}(a\cos\omega t + b\sin\omega t)$. It is customary to speak of this as
an “oscillation” of circular frequency $\omega$; the zeroes of the solutions are spaced
apart by $\pi/\omega$, although the maxima and minima are not equally spaced between
them. The figure below shows the basic solutions involving cos and sin
alone.


$\rho > \omega_0$, “overdamped”. Now $\omega_1$ and $\omega_2$ are real, negative, and distinct, and
the solutions are linear combinations

$$x(t) = ae^{\omega_1 t} + be^{\omega_2 t}$$

of two exponential decays. When $\rho^2 - \omega_0^2$ is large, the root $\omega_1$ is close to 0,
and there will be solutions with a component $ae^{\omega_1 t}$ that decay only very slowly.
As we make $\rho^2 - \omega_0^2$ smaller this becomes less pronounced, which gives special
importance to the final case:




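$\rho = \omega_0$, “critically damped”. Then we have only one root $\omega = -\rho$, giving us
only the solutions $x(t) = ae^{-\rho t}$. The standard way to guess a second solution
is to consider the underdamped equation with $\omega_0^2 - \rho^2 = \varepsilon^2$ for small $\varepsilon$, and
note that the averaged solution

$$\frac{1}{2i\varepsilon}\,e^{-\rho t}\bigl[e^{i\varepsilon t} - e^{-i\varepsilon t}\bigr] \quad \text{approaches} \quad te^{-\rho t} \quad \text{as } \varepsilon \to 0,$$

revealing the general solution

$$x(t) = ae^{-\rho t} + bte^{-\rho t}.$$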





In many mechanisms, one wants a quick return to a steady position after an
initial displacement. For example, an electrical meter should give a steady reading
shortly after it has been connected to a circuit or a switch has been closed;
hydraulic and pneumatic spring returns for doors need to close the door reasonably
quickly without hitting the door frame so hard that the door bounces
back; and shock absorbers need to return a car bumped by the road to its initial
position without causing the car to oscillate up and down. In such cases, the
mechanisms are designed to have a damping constant just a little larger than
critical damping.
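The three cases can be collected into one small routine. This is a minimal sketch, assuming the equation is $x'' + 2\rho x' + \omega_0^2 x = 0$ with initial data $x(0) = x_0$, $x'(0) = v_0$; the function name and the way the constants are fitted to the initial data are our own illustration.

```python
import math

def damped_solution(rho, omega0, x0, v0):
    """Return (case name, x) where x(t) solves x'' + 2 rho x' + omega0^2 x = 0
    with x(0) = x0 and x'(0) = v0."""
    disc = rho * rho - omega0 * omega0
    if disc < 0.0:
        # underdamped: e^{-rho t} (a cos(w t) + b sin(w t)), w = sqrt(omega0^2 - rho^2)
        w = math.sqrt(-disc)
        a, b = x0, (v0 + rho * x0) / w
        return "underdamped", lambda t: math.exp(-rho * t) * (a * math.cos(w * t) + b * math.sin(w * t))
    if disc > 0.0:
        # overdamped: a e^{w1 t} + b e^{w2 t} with real negative roots w1 > w2
        r = math.sqrt(disc)
        w1, w2 = -rho + r, -rho - r
        a = (v0 - w2 * x0) / (w1 - w2)
        b = x0 - a
        return "overdamped", lambda t: a * math.exp(w1 * t) + b * math.exp(w2 * t)
    # critically damped: (a + b t) e^{-rho t}
    a, b = x0, v0 + rho * x0
    return "critically damped", lambda t: (a + b * t) * math.exp(-rho * t)
```

The exact comparison `disc == 0` is only reached for idealized inputs; for measured values of $\rho$ and $\omega_0$ one would presumably test against a tolerance instead.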


Returning to the underdamped case, if the damping is small,

$$x(t) = e^{-\rho t}A\cos(\omega t + \phi), \qquad \rho \ll \omega_0,$$

many oscillations will occur in a period of time during which the $e^{-\rho t}$ term
varies only slightly, so at any time $t$ we practically have simple harmonic motion
of amplitude $Ae^{-\rho t}$ with energy $E(t) = \tfrac12 kA^2e^{-2\rho t} = E(0)e^{-2\rho t}$, decaying
exponentially. While $\omega/\pi \approx \omega_0/\pi$ is the reciprocal of the “semi-period” (the
time between two zeros of the decaying oscillation), $2\rho$ is basically the reciprocal
of the time it takes for the energy to decay by a factor of $1/e$. The quotient

$$Q = \frac{\omega_0}{2\rho}$$

is a “dimensionless” number¹ called the quality of the oscillation; an oscillation
with high $Q$ loses very little energy per oscillation.


¹ Alternatively, all terms of the equation $mx'' + m\omega_0^2 x + 2m\rho x' = 0$ must have the
dimensions of force, $MLT^{-2}$. For $m\omega_0^2 x$, where $mx$ has the dimensions $ML$, this means
that $\omega_0^2$ must have dimensions $T^{-2}$, so $\omega_0$ has dimensions $T^{-1}$. Similarly, for $2m\rho x'$,
where $mx'$ has dimensions $MLT^{-1}$, this means that $\rho$ must have dimensions $T^{-1}$.




As an example¹ of the significance of $Q$, a standard tuning fork (A above
middle C) has a frequency of 440 cycles per second. Using a decibel meter for
a rough approximation, the intensity of sound was found to decrease by a factor
of 5 in 4 seconds. This means that $4 \times (2\rho) = \log 5 \approx 1.6$, so

$$2\rho \approx 0.4.$$

By contrast, a paperweight suspended from a sturdy rubber band was found to
have a period of 1.2 seconds and the amplitude of oscillation decreased by a
factor of 2 after three periods, giving

$$2\rho \approx 0.39, \qquad Q \approx \frac{\omega_0}{2\rho} = \frac{2\pi/T}{0.39} = \frac{2\pi/1.2}{0.39} \approx 13.$$



In both cases, the air resistance makes only a small contribution to the damping 
factor—most of the energy loss occurs internally, showing up as heating of the 
metal or the rubber during the vibrations. The damping factor $2\rho$ is nearly
the same in both cases, but the tuning fork has a much greater $Q$ because it
oscillates much more rapidly, so it has a much lower loss of energy per cycle,
which is what Q measures. 
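The arithmetic behind the two examples can be reproduced in a few lines. This is a rough sketch, assuming the quality is taken to be $Q = \omega_0/2\rho$, that the sound intensity decays like the energy, $e^{-2\rho t}$, and that the amplitude decays like $e^{-\rho t}$; the function and variable names are ours.

```python
import math

def two_rho_from_energy_decay(factor, seconds):
    """Energy falls by `factor` in `seconds`: exp(-2 rho t) = 1/factor."""
    return math.log(factor) / seconds

def two_rho_from_amplitude_decay(factor, seconds):
    """Amplitude falls by `factor` in `seconds`: exp(-rho t) = 1/factor."""
    return 2.0 * math.log(factor) / seconds

def quality(omega0, two_rho):
    """Assumed definition Q = omega0 / (2 rho)."""
    return omega0 / two_rho

# Tuning fork: 440 cycles per second, intensity down by a factor of 5 in 4 seconds.
fork_two_rho = two_rho_from_energy_decay(5.0, 4.0)
fork_Q = quality(2.0 * math.pi * 440.0, fork_two_rho)

# Rubber band: period 1.2 seconds, amplitude halved after three periods.
band_two_rho = two_rho_from_amplitude_decay(2.0, 3 * 1.2)
band_Q = quality(2.0 * math.pi / 1.2, band_two_rho)
```

Under these assumptions the two damping factors come out close to each other, while the tuning fork's quality is several thousand against roughly 13 for the rubber band, which is the contrast the text draws.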


Forced oscillations. The equation

$$x''(t) + \omega_0^2 x(t) = F(t)$$

describes a situation where an external “driving” force $F$ is being applied, in
addition to the force of the spring or other appropriate feature of our system
that provides the $\omega_0^2 x(t)$ term; note that here $F$ already has the mass $m$ of our
object divided out, so it actually has the dimensions of acceleration. The general
solution of this inhomogeneous equation is the sum of any particular
solution and a solution of the homogeneous equation $x'' + \omega_0^2 x = 0$.
Although a solution can be found for arbitrary $F$ by the method of “variation
of parameters” of elementary differential equations, the main case that interests
us is when the driving force $F$ is itself oscillating, so that we have the equation

$$x''(t) + \omega_0^2 x(t) = C\sin\omega t$$

for some $\omega$, the circular frequency of the driving force.


¹ Kleppner and Kolenkow [1]


Oscillations 299 


If we try for a particular solution of the form

$$x(t) = c\sin\varpi t,$$

we find that

$$c = \frac{C}{\omega_0^2 - \varpi^2},$$

and our general solution is

$$x(t) = a\cos\omega_0t + b\sin\omega_0t + c\sin\varpi t,$$

where a and b are determined by the initial conditions, while c is the constant just found.
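Note that if we write the general solution for $\varpi = \omega_0 + \varepsilon$ as

$$x(t) = Ae^{i\omega_0t} + Be^{i(\omega_0+\varepsilon)t} = [A + Be^{i\varepsilon t}]e^{i\omega_0t},$$

then for $\varepsilon \ll \omega_0$ the factor $A + Be^{i\varepsilon t}$ will vary only slightly over the period
$2\pi/\omega_0$ of $e^{i\omega_0t}$, so we have something like oscillations of period $2\pi/\omega_0$ with
varying amplitude $|A + Be^{i\varepsilon t}|$. Writing $A = ae^{i\alpha}$, $B = be^{i\beta}$, we find that

$$|A + Be^{i\varepsilon t}|^2 = a^2 + b^2 + 2ab\cos(\varepsilon t + \beta - \alpha),$$

so this amplitude varies periodically with frequency $\varepsilon$, giving the phenomenon
known as beats.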


For $\varpi = \omega_0$ our solution makes no sense, giving an “infinite” amplitude. A
specific solution for $\varpi = \omega_0$ might be discovered, analogously to the case of
critically damped oscillations, by considering the solution

$$x(t) = -\frac{C}{\omega_0^2 - \varpi^2}(\sin\omega_0t - \sin\varpi t),$$

where the limit as $\varpi \to \omega_0$ is

$$x(t) = -\frac{C}{2\omega_0}\,t\cos\omega_0t.$$
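
As a numerical sanity check (a sketch, not part of the original argument), one can compare the off-resonance solution for a driving frequency very close to $\omega_0$ with the limiting formula, and confirm by finite differences that the limit satisfies $x'' + \omega_0^2x = C\sin\omega_0t$. The function names, the sample values of $C$ and $\omega_0$, and the step size are arbitrary choices made for this illustration.

```python
import math

def near_resonance(t, C, omega0, varpi):
    # Off-resonance particular solution minus a homogeneous solution.
    return -C / (omega0**2 - varpi**2) * (math.sin(omega0 * t) - math.sin(varpi * t))

def at_resonance(t, C, omega0):
    # Limiting solution as the driving frequency tends to omega0.
    return -C / (2.0 * omega0) * t * math.cos(omega0 * t)

def residual(t, C, omega0, h=1e-4):
    # x'' + omega0^2 x - C sin(omega0 t), with x'' from a central second difference.
    x = lambda s: at_resonance(s, C, omega0)
    x_dd = (x(t + h) - 2.0 * x(t) + x(t - h)) / h**2
    return x_dd + omega0**2 * x(t) - C * math.sin(omega0 * t)

C, omega0 = 1.0, 2.0
for t in (0.5, 1.0, 3.0):
    gap = near_resonance(t, C, omega0, omega0 + 1e-6) - at_resonance(t, C, omega0)
    print(t, gap, residual(t, C, omega0))
```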


300 Chapter 8 


This solution still has disconcerting features, as its amplitude approaches $\infty$ as
$t \to \infty$. This anomaly arises not only because our equations won't even hold for
these large oscillations, but also because we have so far ignored the fact that
physically there is always some damping present.


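Damped forced oscillations. We therefore consider the equation

$$x''(t) + 2\rho x'(t) + \omega_0^2x(t) = F(t),$$

where again we are mainly interested in the case

$$x''(t) + 2\rho x'(t) + \omega_0^2x(t) = C\cos\varpi t;$$

we can also write this in the more convenient complex form

$$x''(t) + 2\rho x'(t) + \omega_0^2x(t) = Ce^{i\varpi t}.$$

We now try for a solution of the form $x(t) = (ce^{i\theta})e^{i\varpi t}$. We need that

$$ce^{i\theta} = \frac{C}{\omega_0^2 - \varpi^2 + 2i\rho\varpi} = \frac{C(\omega_0^2 - \varpi^2 - 2i\rho\varpi)}{(\omega_0^2 - \varpi^2)^2 + 4\varpi^2\rho^2},$$

so, denoting the denominator by $\Delta > 0$, we have

$$\left.\begin{aligned} c\cos\theta &= C(\omega_0^2 - \varpi^2)/\Delta\\ c\sin\theta &= -2C\rho\varpi/\Delta\end{aligned}\right\}\ \Longrightarrow\ c = \frac{C}{\sqrt{\Delta}}, \qquad \tan\theta = \frac{-2\rho\varpi}{\omega_0^2 - \varpi^2}.$$

The real part of the solution $x(t) = ce^{i\theta}e^{i\varpi t}$ then gives the particular solution

$$c\cos(\varpi t + \theta).$$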


Any ambiguity in determining $\theta$ from the formula for $\tan\theta$ is resolved by the
specific formula for $c\sin\theta$, which shows that $\theta < 0$ for all $\varpi > 0$, with the value


Oscillations 301 


of $\theta$ starting at 0 when $\varpi = 0$, reaching $-\pi/2$ at $\varpi = \omega_0$, and approaching
$-\pi$ as $\varpi \to \infty$.
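
The amplitude and phase of the steady-state response can be read off numerically from the complex quotient $C/(\omega_0^2 - \varpi^2 + 2i\rho\varpi)$. The following is a minimal sketch; the function name and the sample parameter values are assumptions made only for illustration.

```python
import cmath
import math

def steady_state(C, omega0, rho, varpi):
    # Complex response c*exp(i*theta) = C / (omega0^2 - varpi^2 + 2i*rho*varpi).
    z = C / (omega0**2 - varpi**2 + 2j * rho * varpi)
    return abs(z), cmath.phase(z)  # amplitude c, phase theta in (-pi, pi]

C, omega0, rho = 1.0, 2.0, 0.1
for varpi in (1e-3, 1.0, 2.0, 4.0, 1e3):
    c, theta = steady_state(C, omega0, rho, varpi)
    print(f"varpi = {varpi:8.3f}   c = {c:.6f}   theta = {theta:+.6f}")
```

For these sample values the printed phase stays negative, passes through $-\pi/2$ at $\varpi = \omega_0$, and approaches $-\pi$ for large $\varpi$.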





In particular, consider the underdamped case $\rho < \omega_0$, and let $\omega = \sqrt{\omega_0^2 - \rho^2}$,
as on page 296. The general solution is

$$x(t) = e^{-\rho t}(a\cos\omega t + b\sin\omega t) + c\cos(\varpi t + \theta),$$

or, if we also write the solution of the homogeneous part in terms of amplitude
and phase, as

$$x(t) = Ae^{-\rho t}\cos(\omega t + \phi) + c\cos(\varpi t + \theta),$$

where a and b, or A and $\phi$, are determined by the initial conditions and c
and $\theta$ are constants. The first term, which dies out as t becomes large, is called
the transient, while the second term is the steady state solution. Since $\theta < 0$,
the steady state solution always lags behind the driving force; it is exactly $\pi/2$
behind when $\varpi = \omega_0$, and comes close to being $\pi$ behind as $\varpi$ increases.

If we plot the value of c against $\varpi$ (for some fixed C), we get a graph like that
in (a) of the figure below, with the maximum at $\varpi = \sqrt{\omega_0^2 - 2\rho^2}$. Part (b) of the
figure shows the graph we would have for $\rho = 0$, as on page 299; the formula
actually gives negative values for $\varpi > \omega_0$, but amplitude is by definition positive,
so this must correspond to a phase shift of magnitude $\pi$, and consideration of
the damped case shows that we should consider it to be a lagging shift $-\pi$,
rather than a leading one $+\pi$.
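
The location of the maximum can be checked by brute force. This sketch assumes the amplitude formula $c = C/\sqrt{(\omega_0^2 - \varpi^2)^2 + 4\rho^2\varpi^2}$ and simply scans a grid of driving frequencies; the grid size and the sample parameters are arbitrary.

```python
import math

def amplitude(C, omega0, rho, varpi):
    # Steady-state amplitude c for driving frequency varpi.
    return C / math.sqrt((omega0**2 - varpi**2)**2 + 4.0 * rho**2 * varpi**2)

def peak_frequency(C, omega0, rho, steps=200000):
    # Grid search over (0, 2*omega0] for the driving frequency of largest amplitude.
    best_w, best_c = 0.0, 0.0
    for i in range(1, steps + 1):
        w = 2.0 * omega0 * i / steps
        c = amplitude(C, omega0, rho, w)
        if c > best_c:
            best_w, best_c = w, c
    return best_w

omega0, rho = 2.0, 0.5
print(peak_frequency(1.0, omega0, rho), math.sqrt(omega0**2 - 2.0 * rho**2))
```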


302 Chapter 8 


The phase shift can be demonstrated with a pendulum of length $l$, and thus
natural circular frequency $\omega_0 = \sqrt{g/l}$, whose suspension point is being moved
horizontally along AB in harmonic motion with circular frequency $\varpi$ as in (a) of
the figure below, since its equation, from Problem 1-18, is precisely an undamped
forced oscillation. In the case $\varpi < \omega_0$, once the steady state has essentially been





(a) (b) (c) 

reached the pendulum bob follows the direction of the suspension point: when
the suspension point is all the way to the left (A) or right (B), the same is true
of the lower end of the pendulum (at A′ or B′). This lower end moves exactly
like the end of a pendulum having length $> l$ that is attached to a higher
suspension point, as in part (b). Part (c) shows the case where $\varpi > \omega_0$: the
lower end swings right when the suspension point is moving left and left when
the suspension point is moving right, and moves like the end of a pendulum of
length $< l$ suspended from a lower point.


Coupled oscillators. Many physical systems can be thought of as a collection
of oscillators influencing each other. The possible behavior of such systems can
be quite complex, even in one of the simplest such systems, consisting of two
pendulums of the same mass $m$ and pendulum length $l$, connected by a spring,
with spring constant $\kappa$, whose unstretched length is the distance between the two
pendulums when they are both vertical. Letting $K = \kappa/m$ and $\omega_0 = \sqrt{g/l}$,
and restricting our attention to a linear approximation of the actual motion,
as in the case of the spherical pendulum, we can write the equations for the
displacements $x_1$ and $x_2$ of the two pendulums from their respective vertical
positions as the pair of equations

$$x_1'' + \omega_0^2x_1 = -K(x_1 - x_2)$$
$$x_2'' + \omega_0^2x_2 = -K(x_2 - x_1).$$


Oscillations 303 


Although there are various mathematical tricks to try for solving this pair
of equations, it's really easiest to consider the two physically obvious solutions
where the pendulums either move in sync (a), with $x_1$ always equal to $x_2$, or in
anti-sync (b), with $x_1$ always equal to $-x_2$.

Motion in sync naturally leads to identical equations $x_i'' + \omega_0^2x_i = 0$, so that
both pendulums have circular frequency $\omega_0$, with

$$x_1(t) = a\cos\omega_0t + b\sin\omega_0t$$
$$x_2(t) = a\cos\omega_0t + b\sin\omega_0t.$$


Motion in anti-sync, $x_2 = -x_1$, also leads to identical equations,

$$x_1'' + \omega_0^2x_1 = -K(x_1 - x_2) = -2Kx_1$$
$$x_2'' + \omega_0^2x_2 = -K(x_2 - x_1) = -2Kx_2,$$

so that both pendulums have circular frequency

$$\omega_0^+ = \sqrt{\omega_0^2 + 2K} \qquad \Bigl(\approx \omega_0 + \frac{K}{\omega_0} \text{ for } K \ll \omega_0^2\Bigr),$$

giving

$$x_1(t) = c\cos\omega_0^+t + d\sin\omega_0^+t$$
$$x_2(t) = -c\cos\omega_0^+t - d\sin\omega_0^+t.$$

It is to be expected that $\omega_0^+ > \omega_0$, since the stretched spring speeds up the
oscillations.

These solutions, each with both pendulums having the same period, are called
normal modes of the system. Combinations of the two,

$$x_1(t) = a\cos\omega_0t + b\sin\omega_0t + c\cos\omega_0^+t + d\sin\omega_0^+t$$
$$x_2(t) = a\cos\omega_0t + b\sin\omega_0t - c\cos\omega_0^+t - d\sin\omega_0^+t,$$

with four arbitrary constants $a, b, c, d$, will give us a solution $(x_1, x_2)$ with any
desired initial conditions for $x_i(0)$, $x_i'(0)$.


304 Chapter 8 


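In contrast to the special cases we began with, consider the asymmetrical case
where we start with the second pendulum hanging straight down, and the first
displaced by a certain amount,

$$x_1(0) = C, \quad x_1'(0) = 0; \qquad x_2(0) = 0, \quad x_2'(0) = 0.$$

These initial conditions give

$$x_1(0) = C = a + c \qquad x_1'(0) = 0 = b\omega_0 + d\omega_0^+$$
$$x_2(0) = 0 = a - c \qquad x_2'(0) = 0 = b\omega_0 - d\omega_0^+,$$

so that $a = c = \tfrac12C$ and $b = d = 0$, and we have

$$x_1(t) = \tfrac12C(\cos\omega_0t + \cos\omega_0^+t)$$
$$x_2(t) = \tfrac12C(\cos\omega_0t - \cos\omega_0^+t),$$

which can be written as

$$x_1(t) = C\cos\Bigl(\frac{\omega_0^+ - \omega_0}{2}\,t\Bigr)\cos\Bigl(\frac{\omega_0^+ + \omega_0}{2}\,t\Bigr)$$
$$x_2(t) = C\sin\Bigl(\frac{\omega_0^+ - \omega_0}{2}\,t\Bigr)\sin\Bigl(\frac{\omega_0^+ + \omega_0}{2}\,t\Bigr).$$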
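
The equivalence of the sum form and the product form, and the time at which the first pendulum's slowly varying amplitude factor vanishes, can be checked numerically. This is only a sketch: the function names and the sample values of $\omega_0$ and $K$ are assumptions, and the sign convention for $x_2$ is the one obtained from the identity $\cos A - \cos B = 2\sin\frac{A+B}{2}\sin\frac{B-A}{2}$.

```python
import math

def sum_form(t, C, w0, K):
    wp = math.sqrt(w0**2 + 2.0 * K)  # anti-sync frequency omega_0^+
    x1 = 0.5 * C * (math.cos(w0 * t) + math.cos(wp * t))
    x2 = 0.5 * C * (math.cos(w0 * t) - math.cos(wp * t))
    return x1, x2

def product_form(t, C, w0, K):
    wp = math.sqrt(w0**2 + 2.0 * K)
    slow = 0.5 * (wp - w0) * t  # slowly varying (beat) phase
    fast = 0.5 * (wp + w0) * t  # rapid oscillation phase
    return C * math.cos(slow) * math.cos(fast), C * math.sin(slow) * math.sin(fast)

C, w0, K = 1.0, 1.0, 0.05
for t in (0.0, 3.7, 20.0, 55.5):
    print(t, sum_form(t, C, w0, K), product_form(t, C, w0, K))

# When (omega_0^+ - omega_0) t / 2 = pi/2 the slow factor of x1 is zero.
t_transfer = math.pi / (math.sqrt(w0**2 + 2.0 * K) - w0)
print(t_transfer, sum_form(t_transfer, C, w0, K))
```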


For “weak coupling”, when $K/\omega_0$ is small, and hence $\omega_0^+ - \omega_0$ is small, the first
factors in the expressions for $x_1(t)$ and $x_2(t)$ vary slowly with time, and we
again have beats.

[Figure: displacements of pendulum 1 and pendulum 2 plotted against time.]

What we observe is that as pendulum 1 begins to swing, the
amplitude of the swing decreases; at the same time, pendulum 2 slowly begins to
swing, with increasing amplitude. This process continues until pendulum 1
momentarily comes to a stop, at which point pendulum 2 is now swinging with the
original amplitude of pendulum 1, and then the process reverses, until we once


Oscillations 305 


again have pendulum 1 swinging with maximum amplitude and pendulum 2
at rest. Energy is continually transferred back and forth from one pendulum to
the other, with the total energy remaining the same, except for frictional losses.

Instead of solving our original equations

$$x_1'' + \omega_0^2x_1 = -K(x_1 - x_2)$$
$$x_2'' + \omega_0^2x_2 = -K(x_2 - x_1),$$

on the basis of the two physically obvious solutions, we could have used the
equivalent mathematical trick of adding and subtracting them to get equations
for $z_1 = x_1 - x_2$ and $z_2 = x_1 + x_2$, and then expressing the results back in
terms of $x_1$ and $x_2$.
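With the advantage of hindsight, we might simply look for normal modes to
begin with. This is the approach we will use for the case where the pendulums
are not identical, so that we have masses $m_1$ and $m_2$ and lengths $l_1$ and $l_2$.
Setting

$$K_i = \kappa/m_i, \qquad \omega_i = \sqrt{g/l_i}, \qquad i = 1, 2,$$

we now have the equations

$$x_1''(t) + \omega_1^2x_1(t) = -K_1(x_1 - x_2)$$
$$x_2''(t) + \omega_2^2x_2(t) = -K_2(x_2 - x_1),$$

and we look for solutions that are the real parts of complex solutions

$$x_1(t) = Ae^{i\lambda t}, \qquad x_2(t) = Be^{i\lambda t},$$

with the same circular frequency $\lambda$. We then obtain

$$A(\omega_1^2 - \lambda^2 + K_1) = K_1B$$
$$B(\omega_2^2 - \lambda^2 + K_2) = K_2A.$$

This leads to

$$\text{(a)}\qquad \frac{B}{A} = \frac{\omega_1^2 - \lambda^2 + K_1}{K_1} = \frac{K_2}{\omega_2^2 - \lambda^2 + K_2},$$

and thus to

$$[\lambda^2 - (\omega_1^2 + K_1)][\lambda^2 - (\omega_2^2 + K_2)] = K_1K_2,$$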

306 Chapter 8 


which is a quadratic equation in $\lambda^2$, having roots that we will call $\varpi_1^2$ and $\varpi_2^2$.
If $r_1$ and $r_2$ are the ratios $B/A$ arising from (a) for $\lambda^2 = \varpi_1^2$ and $\lambda^2 = \varpi_2^2$,
respectively, then we can write the general solution as

$$x_1(t) = a\cos\varpi_1t + b\sin\varpi_1t + c\cos\varpi_2t + d\sin\varpi_2t$$
$$x_2(t) = r_1a\cos\varpi_1t + r_1b\sin\varpi_1t + r_2c\cos\varpi_2t + r_2d\sin\varpi_2t.$$
With the same initial conditions as before,

$$x_1(0) = C, \quad x_1'(0) = 0; \qquad x_2(0) = 0, \quad x_2'(0) = 0,$$

we again find $b = d = 0$ and

$$a = \frac{r_2}{r_2 - r_1}\,C, \qquad c = -\frac{r_1}{r_2 - r_1}\,C,$$

giving two equations of somewhat different form,

$$x_1(t) = \frac{C}{r_2 - r_1}(r_2\cos\varpi_1t - r_1\cos\varpi_2t)$$
$$x_2(t) = \frac{C}{r_2 - r_1}\,r_1r_2(\cos\varpi_1t - \cos\varpi_2t).$$


As before, the second can be written

$$x_2(t) = \frac{2r_1r_2C}{r_2 - r_1}\,\sin\Bigl(\frac{\varpi_2 - \varpi_1}{2}\,t\Bigr)\sin\Bigl(\frac{\varpi_1 + \varpi_2}{2}\,t\Bigr),$$

whose slowly varying factor has zeros at $t = 2\pi n/(\varpi_2 - \varpi_1)$. But $x_1$ is not zero at the times when $x_2$ has a
maximum, so the energy is never completely transferred from the first pendulum
to the second.

[Figure: displacements of pendulum 1 and pendulum 2 plotted against time.]
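
A small sketch of the computation for non-identical pendulums: it solves the quadratic in $\lambda^2$ and forms the ratios $B/A$, then checks that the two expressions for the ratio agree. The function name, argument order, and sample numbers are assumptions made for this illustration.

```python
import math

def normal_modes(w1_sq, w2_sq, K1, K2):
    # Roots of (lam2 - p)(lam2 - q) = K1*K2 with p = w1^2 + K1, q = w2^2 + K2.
    p, q = w1_sq + K1, w2_sq + K2
    disc = math.sqrt((p - q)**2 + 4.0 * K1 * K2)
    lam2 = [(p + q - disc) / 2.0, (p + q + disc) / 2.0]
    # Ratio B/A from the first of the two linear equations.
    ratios = [(p - value) / K1 for value in lam2]
    return lam2, ratios

# Identical pendulums: recovers omega_0^2 and omega_0^2 + 2K, with ratios +1 and -1.
print(normal_modes(1.0, 1.0, 0.1, 0.1))

# Non-identical pendulums: the ratios are no longer +1 and -1.
lam2, ratios = normal_modes(1.0, 1.5, 0.1, 0.2)
print(lam2, ratios)
```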


Generalization of our considerations to N harmonic oscillators, all interacting 
in a linear way, is easy, mainly because we speak only in generalities. We are 


Oscillations 307 


considering the $N$ equations

$$x_k''(t) + \omega_k^2x_k(t) = \sum_{l=1}^N a_{kl}x_l(t), \qquad k = 1, \ldots, N,$$

where $a_{kl} = a_{lk}$ because of the third law. For a normal mode

$$x_k(t) = c_ke^{i\lambda t}, \qquad k = 1, \ldots, N,$$

we must have

$$-\lambda^2c_k = \sum_{l=1}^N a_{kl}c_l - c_k\omega_k^2, \qquad k = 1, \ldots, N,$$

or in matrix form

$$-\lambda^2\begin{pmatrix}c_1\\ \vdots\\ c_N\end{pmatrix} = \left[\begin{pmatrix}-\omega_1^2 & & 0\\ & \ddots & \\ 0 & & -\omega_N^2\end{pmatrix} + \begin{pmatrix}a_{11} & \cdots & a_{1N}\\ \vdots & & \vdots\\ a_{N1} & \cdots & a_{NN}\end{pmatrix}\right]\begin{pmatrix}c_1\\ \vdots\\ c_N\end{pmatrix},$$

so that $-\lambda^2$ is an eigenvalue of the matrix in brackets. Since this matrix is
symmetric, it has a basis of eigenvectors with real eigenvalues, so each $\lambda^2$ is real.
Of course, $\lambda^2$ can be negative, so that $\lambda = ib$ for $b > 0$, giving solutions that are
multiples of $e^{-bt}$, rather than oscillations. For example, if our original equations
on page 302 had $\omega_0 = 1$, but $K = -1$ instead of $K = 1$ (corresponding to
a strange spring that keeps pushing the pendulums further and further apart),
then our formula for $\omega_0^+$ on page 303 wouldn't make sense. Thus, there is a
basis of eigenvectors, with positive values of $\lambda^2$ each leading to a normal mode,
and negative values leading to exponentials, which we may regard as a sort
of normal mode also. Since the corresponding $(c_1, \ldots, c_N)$ are linearly
independent, every solution can be written as a linear combination of these normal
modes; some eigenvalues may have multiplicities, leading to normal modes with
the same circular frequencies or exponential solutions with the same exponent.
In Chapter 12 we will see that this result obtains for much more general systems.
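
For the two-pendulum system the eigenvalue computation can be carried out by hand; the sketch below does it in code for the symmetric matrix $\operatorname{diag}(\omega_0^2) - (a_{kl})$ with $a_{11} = a_{22} = -K$, $a_{12} = a_{21} = K$, whose eigenvalues are the values of $\lambda^2$. The function name is hypothetical, and the closed form used applies only to symmetric $2\times 2$ matrices.

```python
import math

def lambda_squared(w0, K):
    # Matrix [[w0^2 + K, -K], [-K, w0^2 + K]]; its eigenvalues are the values of lambda^2.
    m11 = m22 = w0**2 + K
    m12 = -K
    mean = 0.5 * (m11 + m22)
    spread = math.sqrt((0.5 * (m11 - m22))**2 + m12**2)
    return mean - spread, mean + spread

for K in (1.0, -1.0):
    for lam2 in lambda_squared(1.0, K):
        if lam2 >= 0:
            print(f"K = {K:+.0f}: lambda^2 = {lam2:+.1f}, oscillation with frequency {math.sqrt(lam2):.4f}")
        else:
            print(f"K = {K:+.0f}: lambda^2 = {lam2:+.1f}, exponential with b = {math.sqrt(-lam2):.4f}")
```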


The double pendulum. Another sort of coupling is exhibited by the double
pendulum. We will be especially interested in the case where $l_1 = l_2$ and $m_2 \ll m_1$,
where a heavy pendulum, like a chandelier, has a light pendulum of nearly the
same length suspended from it. As described in Sommerfeld [2], after a sharp


308 Chapter 8 


blow to the heavy bob “the light bob will be set in vigorous motion, which 
suddenly subsides and stays at zero for a short time. At this instant one per- 
ceives that the heavy bob, which had previously remained practically at rest, 
now starts oscillating with noticeable amplitude. This oscillation soon ceases, 
however, whereupon in its turn the light pendulum again begins to move with 
considerable vigor, and so forth.” 

Once again, we are considering only a linear approximation to the actual
motion, involving small $\theta_i$; the present analysis, involving some additional
complications, may be compared with that to be found at the end of Chapter 12.
We have the approximations

$$\theta_1 \approx \sin\theta_1 = \frac{x_1}{l_1}, \qquad \theta_2 \approx \sin\theta_2 = \frac{x_2 - x_1}{l_2},$$
$$\sin(\theta_2 - \theta_1) \approx \theta_2 - \theta_1 \approx \frac{x_2 - x_1}{l_2} - \frac{x_1}{l_1},$$
$$\cos\theta_1 \approx 1, \qquad \cos\theta_2 \approx 1, \qquad \cos(\theta_1 - \theta_2) \approx 1.$$

The lower pendulum is acted on only by gravity, but the upper pendulum is
also affected by the tension on the string holding the lower pendulum, which,
as on page 210, is

$$m_2g\cos\theta_2 + m_2l_2\dot\theta_2^2.$$

Following Sommerfeld, we will drop the term $\dot\theta_2^2$, as being small of second
order, since, supposedly, $\dot\theta_2$ is of the same order as $\theta_2$ (compare the treatment at
the end of Chapter 12). The horizontal component of this tension $m_2g\cos\theta_2$
is $-m_2g\cos\theta_2\sin(\theta_1 - \theta_2)$, so we have the equations


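$$m_1\ddot x_1 = -m_1\frac{g}{l_1}x_1 + m_2g\Bigl(\frac{x_2 - x_1}{l_2} - \frac{x_1}{l_1}\Bigr)$$
$$m_2\ddot x_2 = -m_2\frac{g}{l_2}(x_2 - x_1),$$

or, setting $\mu = m_2/m_1$,

$$(*)\qquad \begin{aligned}\ddot x_1 + \Bigl(\frac{(1+\mu)g}{l_1} + \frac{\mu g}{l_2}\Bigr)x_1 &= \frac{\mu g}{l_2}\,x_2\\ \ddot x_2 + \frac{g}{l_2}\,x_2 &= \frac{g}{l_2}\,x_1.\end{aligned}$$

If we now take the case $l_1 = l_2 = l$, and set $\omega_0 = \sqrt{g/l}$, we obtain

$$\ddot x_1 + \omega_0^2(1 + 2\mu)x_1 = \mu\omega_0^2x_2$$
$$\ddot x_2 + \omega_0^2x_2 = \omega_0^2x_1,$$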

Oscillations 309 


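similar to the equation in the middle of page 305. When we look for normal
modes

$$x_1(t) = Ae^{i\lambda t}, \qquad x_2(t) = Be^{i\lambda t},$$

we now obtain the equations

$$B(\omega_0^2 - \lambda^2) = A\omega_0^2$$
$$A(\omega_0^2(1 + 2\mu) - \lambda^2) = B\mu\omega_0^2,$$

and, as on page 305, we get

$$\frac{B}{A} = \frac{\omega_0^2}{\omega_0^2 - \lambda^2} = \frac{\omega_0^2(1 + 2\mu) - \lambda^2}{\mu\omega_0^2},$$

leading to

$$(\lambda^2 - \omega_0^2)^2 + 2\mu\omega_0^2(\omega_0^2 - \lambda^2) = \mu\omega_0^4,$$

a quadratic equation in $\lambda^2$ with roots that we will call $\lambda^2 = \varpi_1^2$ and $\lambda^2 = \varpi_2^2$.
Writing the solutions of the quadratic equation in terms of $\sqrt\mu$, with higher
powers of $\sqrt\mu$ dropped, we find that

$$\varpi_1, \varpi_2 = \omega_0(1 \pm \tfrac12\sqrt\mu).$$

We now have the general solution

$$x_1(t) = r_1a\cos\varpi_1t + r_1b\sin\varpi_1t + r_2c\cos\varpi_2t + r_2d\sin\varpi_2t$$
$$x_2(t) = a\cos\varpi_1t + b\sin\varpi_1t + c\cos\varpi_2t + d\sin\varpi_2t$$

if we define $r_1$ and $r_2$, analogously to page 306, as the ratios $A/B$ for $\lambda^2 = \varpi_1^2$ and $\lambda^2 = \varpi_2^2$, and we have approximately

$$r_1 \approx -\sqrt\mu, \qquad r_2 \approx \sqrt\mu, \qquad r_2 - r_1 = 2\sqrt\mu.$$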
Now the initial conditions, from a sharp blow to the heavy bob,

$$x_1(0) = 0, \quad x_1'(0) = C; \qquad x_2(0) = 0, \quad x_2'(0) = 0,$$

lead to

$$x_1 = \frac{C}{r_1 - r_2}\Bigl(\frac{r_1}{\varpi_1}\sin\varpi_1t - \frac{r_2}{\varpi_2}\sin\varpi_2t\Bigr)$$
$$x_2 = \frac{C}{r_1 - r_2}\Bigl(\frac{1}{\varpi_1}\sin\varpi_1t - \frac{1}{\varpi_2}\sin\varpi_2t\Bigr),$$

which gives, using our approximate values for $r_1$ and $r_2$,

$$x_1' = \frac{C}{2}(\cos\varpi_1t + \cos\varpi_2t)$$
$$x_2' = \frac{C}{2\sqrt\mu}(-\cos\varpi_1t + \cos\varpi_2t),$$

with the velocity of the light lower bob $1/\sqrt\mu$ times greater than that of the
heavy upper bob. Our equations can also be written in a form similar to those
on page 304 to show the beats and interchange of energy.
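
The small-$\mu$ approximations can be compared with the exact roots of the quadratic. This sketch writes $y = \lambda^2 - \omega_0^2$, so that the quadratic becomes $y^2 - 2\mu\omega_0^2y - \mu\omega_0^4 = 0$, and takes the ratio $A/B = (\omega_0^2 - \lambda^2)/\omega_0^2$ from the first normal-mode equation; the function name and the sample value of $\mu$ are illustrative assumptions.

```python
import math

def double_pendulum_modes(mu, w0=1.0):
    # Exact roots y = lambda^2 - w0^2 of y^2 - 2*mu*w0^2*y - mu*w0^4 = 0.
    root = math.sqrt(mu * mu + mu)
    modes = []
    for y in (w0**2 * (mu + root), w0**2 * (mu - root)):
        lam = math.sqrt(w0**2 + y)  # circular frequency of this mode
        r = -y / w0**2              # ratio A/B = (w0^2 - lambda^2) / w0^2
        modes.append((lam, r))
    return modes

mu = 1e-4
(lam1, r1), (lam2, r2) = double_pendulum_modes(mu)
print("exact :", lam1, lam2, r1, r2)
print("approx:", 1 + 0.5 * math.sqrt(mu), 1 - 0.5 * math.sqrt(mu), -math.sqrt(mu), math.sqrt(mu))
```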


310 Chapter 8 


The vibrating string. Returning to the consideration of oscillators coupled by 
springs, we are going to consider a “continuous” example, essentially involving 
an (almost) infinite number of coupled oscillators. In complete generality this 
would encompass the discussion of vibrations in continuous media, but the 
vibrating string, where both the motions and the couplings of the particles will 
be quite restricted, may be regarded as a gentle introduction to such problems. 

We think of a string under tension (e.g., a violin string) as a system of particles
numbered $P_0, P_1, \ldots, P_N, P_{N+1}$, each of mass $m$, arrayed along a straight line

[Figure: the particles $P_0, P_1, P_2, P_3, \ldots, P_{N-2}, P_{N-1}, P_N, P_{N+1}$ along a line, with adjacent particles a distance $h$ apart.]

of length $L$, with the particles $P_0$ and $P_{N+1}$ restrained to a fixed position.
We think of adjacent particles as being connected by identical springs (a rough
representation of intermolecular forces) of length $h = L/(N+1)$, greater than
the relaxed length of the spring, so that there is a uniform force drawing the
particles towards each other, leading to the tension $\tau$ of the string.

We assume that our particles move only in a plane, and then ask for the possi- 
ble motions when each particle is oscillating by a small amount about its initial 


point. It might seem rather quixotic to approach this question in terms of cou- 
pled oscillators, since Fourier series essentially provides all the information we 
need about vibrating strings. But, aside from the fact that the initial steps are the 
same as those we would take to obtain the proper differential equations for the 
continuous case, there are some interesting differences between the continuous 
case and the discrete case. 

Suppose first that our particles are moving only vertically, with particle $P_k$ at
height $u_k$ above the initial horizontal line; we should write $u_k(t)$ to indicate the
dependence on the time $t$, but for the moment we are only considering what
happens at one particular time. The line from $P_k$ to $P_{k+1}$ makes an angle $\theta_k$
with the horizontal, which is small if our vibrations are small (and $N$ is large).





Oscillations 311 


The first thing we need to note is that the distance from $P_k$ to $P_{k+1}$ is now

$$\frac{h}{\cos\theta_k} = \frac{h}{1 - \tfrac12\theta_k^2 + \cdots} = h\bigl(1 + \tfrac12\theta_k^2 + \cdots\bigr),$$

so, up to first order, the distance remains the same, and thus, up to first order
the tension remains $\tau$ throughout.

By the same token, the horizontal component of the force on $P_k$ is

$$-\tau\cos\theta_{k-1} + \tau\cos\theta_k = \tfrac12\tau(\theta_{k-1}^2 - \theta_k^2 + \cdots),$$

which is 0 up to first order, somewhat justifying our original supposition that
the particles move only vertically.

Having fudged our way through these preliminaries, we now consider the
force on $P_k$,

$$F_k = -\tau\sin\theta_{k-1} + \tau\sin\theta_k \approx -\frac{\tau}{h}(u_k - u_{k-1}) + \frac{\tau}{h}(u_{k+1} - u_k).$$

For the various functions $u_k(t)$, this gives

$$\text{(a)}\qquad u_k''(t) \approx -\frac{\tau}{mh}\bigl(u_k(t) - u_{k-1}(t)\bigr) + \frac{\tau}{mh}\bigl(u_{k+1}(t) - u_k(t)\bigr),$$

which can be written as

$$u_k''(t) + 2\omega_0^2u_k(t) = \omega_0^2\bigl(u_{k-1}(t) + u_{k+1}(t)\bigr), \qquad \omega_0^2 = \frac{\tau}{mh},$$

in the form of the equations on page 307, where now all $\omega_k^2 = 2\omega_0^2$ and
$a_{kl} = \omega_0^2$ for $l = k \pm 1$, with all other $a_{kl} = 0$.

Finding normal modes as real parts of

$$u_k(t) = c_ke^{i\omega t}, \qquad c_0 = c_{N+1} = 0,$$

by solving the equations

$$(-\omega^2 + 2\omega_0^2)c_k - \omega_0^2(c_{k-1} + c_{k+1}) = 0$$

or by finding the eigenvalues of the corresponding matrix, does not look like
a particularly inviting task. If we were clever enough, we might notice the
following:

(1) Our equations can be written as

$$\text{(a}'\text{)}\qquad \frac{c_{k-1} + c_{k+1}}{c_k} = \frac{2\omega_0^2 - \omega^2}{\omega_0^2},$$

so that the left side should be constant for a normal mode.


312 Chapter 8 


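(2) We have

$$\sin(k-1)\theta + \sin(k+1)\theta = 2\sin k\theta\cos\theta$$

or

$$\frac{\sin(k-1)\theta + \sin(k+1)\theta}{\sin k\theta} = 2\cos\theta,$$

so we will have this constant relation if we take

$$c_k = \sin k\theta$$

for any $\theta$. Since we want $c_0 = c_{N+1} = 0$, we just need $(N+1)\theta$ to be
a multiple of $\pi$, leading to

$$\theta = \frac{n\pi}{N+1}, \qquad n = 1, \ldots, N,$$

and thus

$$\frac{c_{k-1} + c_{k+1}}{c_k} = 2\cos\frac{n\pi}{N+1}.$$

By (a′), the corresponding solution $\omega = \omega_n$ is given by

$$2\cos\frac{n\pi}{N+1} = \frac{2\omega_0^2 - \omega_n^2}{\omega_0^2}$$

$$\omega_n^2 = 2\omega_0^2\Bigl(1 - \cos\frac{n\pi}{N+1}\Bigr) = 4\omega_0^2\sin^2\Bigl(\frac{n\pi}{2(N+1)}\Bigr)$$

$$\omega_n = 2\omega_0\sin\Bigl(\frac{n\pi}{2(N+1)}\Bigr).$$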
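
The clever choice can be verified directly: the sketch below builds $c_k = \sin(kn\pi/(N+1))$, computes $\omega_n = 2\omega_0\sin(n\pi/(2(N+1)))$, and evaluates the left side of the recurrence $(-\omega^2 + 2\omega_0^2)c_k - \omega_0^2(c_{k-1} + c_{k+1}) = 0$ for every interior particle. The function names are illustrative.

```python
import math

def mode(N, n, w0=1.0):
    # Coefficients c_0, ..., c_{N+1} and circular frequency of the n-th mode.
    theta = n * math.pi / (N + 1)
    c = [math.sin(k * theta) for k in range(N + 2)]
    omega_n = 2.0 * w0 * math.sin(theta / 2.0)
    return c, omega_n

def max_residual(N, n, w0=1.0):
    # Largest absolute value of the recurrence's left side over k = 1, ..., N.
    c, w = mode(N, n, w0)
    return max(abs((-w * w + 2.0 * w0 * w0) * c[k] - w0 * w0 * (c[k - 1] + c[k + 1]))
               for k in range(1, N + 1))

N = 4
for n in range(1, N + 1):
    c, w = mode(N, n)
    print(n, round(w, 6), [round(v, 4) for v in c], max_residual(N, n))
```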


For this $n$ we have $c_k = \sin\bigl(kn\pi/(N+1)\bigr)$, so the general real solution of our equations coming from $u_k(t) = c_ke^{i\omega_nt}$ will be

$$(*_n)\qquad u_k(t) = \sin\Bigl(\frac{kn\pi}{N+1}\Bigr)[a\cos\omega_nt + b\sin\omega_nt].$$

For the particle $P_k$, at distance $x = kh = kL/(N+1)$ from the initial point,
the sine factor is just

$$\sin\Bigl(\frac{n\pi x}{L}\Bigr),$$

Oscillations 313 


so for $t = 0$, the particles all lie on $a$ times this sine curve.

[Figure: the modes $n = 1, 2, 3, 4$ for $N = 4$.]

In the lowest mode the particles are all on the same side of the horizontal axis; in the highest mode
they alternate sides.

The factor $a\cos\omega_nt + b\sin\omega_nt = A\cos(\omega_nt + \phi)$ for some $A$ and $\phi$ then
indicates how this configuration changes amplitude with time. In the figure
below, where we have taken $\phi = 0$, amounting merely to a shift of the time
coordinate, we see 7 stages of a half-cycle, with the particles moving from their
positions at $t = 0$ to their positions at $t = \pi/\omega_n$.

If our figure had an additional $t$ axis, one could “see” this motion more
clearly; over the period $T_n = 2\pi/\omega_n$, the whole configuration would pass
through the positions 1-7 and then back to position 1. In the previous figure,
the sine curves and their inscribed piecewise-linear curves are plotted against
distance, rather than time, and the corresponding “period”, during which a
curve goes through a complete cycle, is now called the wave length $\lambda$. The
reciprocal $1/\lambda$, the wave number, is the analogue of the frequency $\nu = 1/T$, so
the analogue of the circular frequency $\omega = 2\pi\nu$ is $\kappa = 2\pi/\lambda$, the angular wave
number.


314 Chapter 8 


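Our solution $(*_n)$ has a wave length $\lambda_n = 2L/n$, so the corresponding angular
wave number is

$$\kappa_n = \frac{2\pi}{\lambda_n} = \frac{n\pi}{L} = \frac{n\pi}{(N+1)h},$$

and our formula for $\omega_n$ can be written

$$\text{(A)}\qquad \omega_n = 2\sqrt{\frac{\tau}{mh}}\,\sin\Bigl(\frac{\kappa_nh}{2}\Bigr).$$

To approximate a continuous string of density $\rho$, we want to choose $m$ and $h$
so that $m/h = \rho$, and thus $\sqrt{\tau/mh} = \sqrt{\tau/\rho}\cdot 1/h$, and we can write

$$\omega_n = 2\sqrt{\frac{\tau}{\rho}}\cdot\frac1h\cdot\sin\Bigl(\frac{\kappa_nh}{2}\Bigr).$$

We normally think of having $N \gg n$, in which case $\kappa_nh$ will be very small.
Since $1 > \sin\theta/\theta \to 1$ as $\theta \to 0$, we have

$$(\text{A}_\kappa)\qquad \omega_n = 2\sqrt{\frac{\tau}{\rho}}\cdot\frac1h\cdot\sin\Bigl(\frac{\kappa_nh}{2}\Bigr) \approx \sqrt{\frac{\tau}{\rho}}\cdot\kappa_n,$$

and we might expect that we will have exact equality for the continuous case,
which we want to consider next, naturally hoping that everything will turn out
to be easier, and we won't have to be so clever.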
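
To see how close the discrete frequencies are to the linear relation $\omega_n \approx \sqrt{\tau/\rho}\,\kappa_n$, one can tabulate both. In this sketch the tension, density, and length are arbitrary sample values, the mass per particle is taken as $m = \rho h$, and the function names are hypothetical.

```python
import math

def discrete_omega(n, N, L, tau, rho):
    # omega_n = 2*sqrt(tau/rho)*(1/h)*sin(kappa_n*h/2), with h = L/(N+1), kappa_n = n*pi/L.
    h = L / (N + 1)
    kappa = n * math.pi / L
    return 2.0 * math.sqrt(tau / rho) / h * math.sin(kappa * h / 2.0)

def continuous_omega(n, L, tau, rho):
    # Linear relation sqrt(tau/rho)*kappa_n.
    return math.sqrt(tau / rho) * n * math.pi / L

L, tau, rho = 1.0, 1.0, 1.0
for N in (4, 40, 1000):
    for n in (1, 4):
        ratio = discrete_omega(n, N, L, tau, rho) / continuous_omega(n, L, tau, rho)
        print(f"N = {N:5d}  n = {n}  discrete/continuous = {ratio:.6f}")
```

The ratio is always below 1 and tends to 1 as $N$ grows with $n$ fixed.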








In equation (a) on page 311 we consider $u(x,t)$ for arbitrary $x$ in an interval
$[0, L]$, and see what we get as $h \to 0$, or $N \to \infty$, keeping $m/h = \rho$. Instead
of considering particles $P_{k-1}, P_k, P_{k+1}$, we simply consider arbitrary points
$x - h,\ x,\ x + h \in [0, L]$, and write

$$\frac{\partial^2u}{\partial t^2}(x,t) = \frac{\tau}{\rho}\lim_{h\to0}\frac{u(x+h,t) + u(x-h,t) - 2u(x,t)}{h^2}.$$

But an easy application of Taylor's theorem shows that

$$\lim_{h\to0}\frac{f(x+h) + f(x-h) - 2f(x)}{h^2} = f''(x),$$

so we arrive at the classical 1-dimensional wave equation,

$$\frac{\partial^2u}{\partial t^2} = v^2\frac{\partial^2u}{\partial x^2}, \qquad v^2 = \frac{\tau}{\rho}.$$
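
The Taylor-theorem limit is easy to observe numerically. The sketch below evaluates the symmetric second difference for $f = \sin$ at a sample point and shrinking $h$; the test function, the point, and the step sizes are arbitrary choices.

```python
import math

def second_difference(f, x, h):
    # (f(x+h) + f(x-h) - 2 f(x)) / h^2, which tends to f''(x) as h -> 0.
    return (f(x + h) + f(x - h) - 2.0 * f(x)) / h**2

x = 0.7
exact = -math.sin(x)  # second derivative of sin
for h in (1e-1, 1e-2, 1e-3):
    approx = second_difference(math.sin, x, h)
    print(f"h = {h:g}  approx = {approx:.8f}  error = {abs(approx - exact):.2e}")
```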


Now we have to be clever again, but we only need the standard cleverness of
“separation of variables”, seeking a solution of the form

$$u(x,t) = X(x)T(t).$$


Oscillations 315 


We find that $X(x)T''(t) = v^2X''(x)T(t)$ or

$$\frac{T''(t)}{T(t)} = v^2\frac{X''(x)}{X(x)}.$$

So the two sides must be a constant $K$. The equation $v^2X''/X = K$ won't have
a non-zero solution with $X(0) = X(L) = 0$ for $K \ge 0$, so we can set $K = -\omega^2$,
giving

$$T''(t) + \omega^2T = 0$$
$$X''(x) + \frac{\omega^2}{v^2}X = 0.$$

The solutions to the first are simply

$$T(t) = a\cos(\omega t) + b\sin(\omega t),$$

and for the second we will write

$$X(x) = \sin\Bigl(\frac{\omega x}{v} + \phi\Bigr).$$

From $X(0) = 0$ it follows that $\sin\phi = 0$, so $X(x) = \pm\sin(\omega x/v)$, and then
from $X(L) = 0$ that $\omega$ must be one of the numbers

$$\omega_n = \frac{n\pi v}{L}, \qquad n = 1, 2, 3, \ldots;$$

for each such $\omega_n$ we have the normal modes

$$u(x,t) = \sin\Bigl(\frac{n\pi}{L}x\Bigr)[a_n\cos\omega_nt + b_n\sin\omega_nt],$$

for constants $a_n$ and $b_n$. Naturally, we can use the modes with $b_n = 0$ to work
backwards to the clever choice (a′) on page 312.

The Fourier series for any continuous curve $f$ with $f(0) = f(L) = 0$ is an
(infinite) linear combination of the terms $\sin\frac{n\pi x}{L}$, so any desired initial condition
$u(x, 0)$ can be obtained as a linear combination of these normal
modes by appropriate choice of the $a_n$. Moreover, any desired initial condition
$\partial u/\partial t(x,0)$ can be obtained also, by the proper choice of the $b_n$. So we have the
general solution for our equation with the boundary conditions $u(0,t) = u(L,t) = 0$.

Notice that now the relationship between $\omega_n = n\pi v/L$ and $\kappa_n = n\pi/L$, with
$v^2 = \tau/\rho$, is simply

$$\omega_n = \sqrt{\tau/\rho}\cdot\kappa_n.$$

Comparing with $(\text{A}_\kappa)$ on page 314, we would be led to conclude that for a
physical string, $\omega_n$ should be a little smaller. In actuality, all sorts of other factors
may come into play. In our model of a string as a collection of $N$ particles,


316 Chapter 8 


the only force considered was the tension resulting from the increase of distance
between particles, with no account taken of the force resulting from the slight
difference between the angles that a particle makes with its neighbors (basically,
the “stiffness” of the string). For a piano string, this turns out to be proportional
to $\kappa_n^4$, and the relation between $\omega_n$ and $\kappa_n$ is approximately

$$\omega_n^2 = (\tau/\rho)\,\kappa_n^2 + \alpha\kappa_n^4,$$

for a positive constant $\alpha$, a measure of the stiffness. So in this case, $\omega_n$ grows
faster than in the idealized case (the “harmonics” of a piano string are a little
sharp, rather than a little flat).

Although our analysis of the wave equation followed an obvious path from
the analysis of the discrete case, a completely different approach is also possible.
With a multiplicative change of coordinates, and $t$ replaced by $y$, the classical
1-dimensional wave equation can be written as

$$u_{xx} = u_{yy},$$

with subscripts now denoting partial derivatives. This is actually the prototypical
example of a hyperbolic equation, which has the “normal form”

$$u_{xx} - u_{yy} + \cdots = 0$$

where $\cdots$ denotes terms not involving second derivatives. There is an alternative
normal form

$$u_{xy} + \cdots = 0,$$

corresponding to the possibility of writing the equation for a hyperbola in the
form $xy = 1$, which can be obtained simply by considering the function

$$v(\xi, \eta) = u\Bigl(\frac{\xi + \eta}{2}, \frac{\xi - \eta}{2}\Bigr).$$


In the study of partial differential equations, rather than seeking solutions for
functions with prescribed values at the end points of an interval, we more often
consider arbitrary initial values for a hyperbolic equation on an open interval, or
even the whole $x$-axis, and in the case of the wave equation, this alternative form
gives a complete solution: for the function $v$ just defined, the wave equation
for $u$ becomes simply

$$v_{\xi\eta} = 0,$$

with the general solution

$$v(\xi, \eta) = f(\xi) + g(\eta),$$

leading to

$$u(x, y) = v(x + y, x - y) = f(x + y) + g(x - y)$$


for arbitrary functions f and g, and it is not hard (Problem 5) to determine the 


Oscillations 317 


functions $f$ and $g$ in terms of the initial conditions, and thus write out $u(x, y)$
as an explicit formula.

When we write our equation with the additional constant,

$$\frac{\partial^2u}{\partial t^2} = v^2\frac{\partial^2u}{\partial x^2},$$

we have

$$u(x,t) = f(x + vt) + g(x - vt),$$

which is the sum of two arbitrary “waves”, the first wave moving to the left with
velocity $v$, the second moving to the right with velocity $v$.
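
That any function of the form $f(x + vt) + g(x - vt)$ satisfies the wave equation can be spot-checked with finite differences. In this sketch the particular choices of $f$ (a Gaussian bump) and $g$ (a sine), the velocity, and the step size are arbitrary assumptions for illustration.

```python
import math

def u(x, t, v):
    # A left-moving Gaussian bump plus a right-moving sine wave.
    return math.exp(-(x + v * t)**2) + math.sin(x - v * t)

def wave_residual(x, t, v, h=1e-3):
    # u_tt - v^2 u_xx, with both second derivatives from central differences.
    u_tt = (u(x, t + h, v) + u(x, t - h, v) - 2.0 * u(x, t, v)) / h**2
    u_xx = (u(x + h, t, v) + u(x - h, t, v) - 2.0 * u(x, t, v)) / h**2
    return u_tt - v * v * u_xx

for x, t in ((0.0, 0.0), (0.3, 0.2), (-1.0, 0.7)):
    print(x, t, wave_residual(x, t, v=2.0))
```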





Although we can indeed express any “standing wave” $u(x,t)$ with $u(0,t) = 0
= u(L,t)$ in this way (Problem 6), the “moving waves” defined by this more
general solution are the ones that play the most important role in all sorts of
physical phenomena. Problem 5 provides some information about these waves
that will be used later on in various places in this volume, but a more complete
discussion will have to be deferred for now.


318 Chapter 8 


ADDENDUM 8A 
ABEL’S INTEGRAL EQUATION 


As opposed to calculus of variation problems, where the Euler equations give
a straightforward way of finding necessary conditions for a solution, with
questions of sufficiency leading to more involved considerations, most solutions of
the tautochrone problem demonstrate that the cycloid is a solution without
establishing that it is the only one.

In 1823, Abel considered a more general problem, one of the first examples
of an integral equation. If a particle of mass $m = 1$ starts from rest sliding down
the upside-down cycloid at a point of height $Y$, with potential energy $gY$, then
conservation of energy shows that when it reaches a point of height $y$ we have

$$\frac{ds}{dt} = -\sqrt{2g(Y - y)}.$$

If $t(y)$ is the time at which the particle sliding along the curve has height $y$,
and $\sigma(y)$ is the length of the curve from the bottom to the point at height $y$,
then

$$-\sqrt{2g(Y - y)} = \frac{ds}{dt} = \frac{ds/dy}{dt/dy} = \frac{\sigma'(y)}{t'(y)},$$

so the time $\phi(Y)$ of descent from height $Y$ to the bottom, $y = 0$, satisfies

$$\phi(Y) = \int_Y^0 t'(y)\,dy = -\int_Y^0\frac{\sigma'(y)\,dy}{\sqrt{2g(Y - y)}} = \frac{1}{\sqrt{2g}}\int_0^Y\frac{\sigma'(y)\,dy}{\sqrt{Y - y}}.$$

The tautochrone problem involves the case where the function $\phi$ is simply
a constant, $\phi(Y) = T$, and Abel considered the more general problem, for a
given function $\phi$, of solving for the function $\sigma$ satisfying

$$(1)\qquad \phi(Y) = \int_0^Y\frac{\sigma'(y)\,dy}{\sqrt{Y - y}}$$

or even more generally

$$\phi(Y) = \int_0^Y\frac{\sigma'(y)\,dy}{(Y - y)^n}, \qquad 0 < n < 1.$$

Abel’s Integral Equation 319 


Abel showed that the solution is

$$\sigma(y) = \frac{\sin n\pi}{\pi}\int_0^y\frac{\phi(Y)\,dY}{(y - Y)^{1-n}},$$

which for the case of $n = \tfrac12$, or $\sqrt{Y - y}$, gives

$$(2)\qquad \sigma(y) = \frac1\pi\int_0^y\frac{\phi(Y)\,dY}{\sqrt{y - Y}}.$$

Abel's proof (Abel [1]) is a complex chain of computations starting from the
$\Gamma$ function! (pun intended); nowadays it would be written in terms of the Laplace
transform of the convolution of two functions. An annotated English translation
can be found in Smith [1], which also explains how an earlier paper by Abel
gave a solution in terms of “fractional calculus” (see Miller and Ross [1] for an
introduction to this subject).

A simple proof,¹ sticking to the case $\sqrt{Y - y}$ for convenience, may be given
as follows. Consider

$$\int_0^A\frac{\phi(Y)\,dY}{\sqrt{A - Y}} = \int_0^A\left(\int_0^Y\frac{\sigma'(y)\,dy}{\sqrt{Y - y}}\right)\frac{dY}{\sqrt{A - Y}}.$$

This is the integral of the function

$$\frac{\sigma'(y)}{\sqrt{(Y - y)(A - Y)}}$$

over a triangular region

[Figure: the triangular region $0 \le y \le Y \le A$.]

expressed as an iterated integral, first along the $y$-axis, and then along the
$Y$-axis. Reversing the order of integration, we get

$$\int_0^A\frac{\phi(Y)\,dY}{\sqrt{A - Y}} = \int_0^A\sigma'(y)\left(\int_y^A\frac{dY}{\sqrt{(Y - y)(A - Y)}}\right)dy.$$

¹ Adapted from Landau and Lifschitz [1], where it is used (though without any mention
of Abel's equation) to consider a question about oscillation periods, which appears here
as Problem 4.


320 Chapter 8. Addendum 8A 


The inner integral can be calculated to have the value $\pi$, whatever values $y$
and $A$ may have.¹ This means that the right side is simply $\pi\sigma(A)$, since $\sigma(0) = 0$. Finally
replacing $A$ by $y$, we get the desired result (2).


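For the tautochrone problem, $\phi = T \div 1/\sqrt{2g} = \sqrt{2g}\,T$ (the factor $1/\sqrt{2g}$ having been absorbed into $\phi$ in equation (1)), Abel's solution gives

$$\sigma(y) = \frac{\sqrt{2g}\,T}{\pi}\int_0^y\frac{dY}{\sqrt{y - Y}} = \frac{2\sqrt{2g}\,T}{\pi}\sqrt y$$
$$\sigma'(y) = \frac{\sqrt{2g}\,T}{\pi}\cdot\frac{1}{\sqrt y}.$$

Assuming our curve starts at $(0,0)$, we can then write

$$\sigma'(y) = \frac{ds}{dy} = \sqrt{1 + \Bigl(\frac{dx}{dy}\Bigr)^2}$$
$$\frac{dx}{dy} = \sqrt{\sigma'(y)^2 - 1}$$
$$x = \int_0^y\sqrt{\frac{2gT^2}{\pi^2y} - 1}\;dy + 0 \qquad (\text{since } x = 0 \text{ at } y = 0).$$

The substitution

$$y = 2a\sin^2u, \qquad a = \frac{gT^2}{\pi^2}$$

changes this to

$$x = 4a\int_0^\vartheta\cos^2u\,du \qquad \text{for } \vartheta = \arcsin\sqrt{y/2a},$$

giving

$$x = 2a(\vartheta + \tfrac12\sin2\vartheta)$$
$$y = 2a\sin^2\vartheta,$$

which is a cycloid parameterized by $\theta = 2\vartheta$.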
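
The tautochrone property of this solution can be checked numerically: the sketch below evaluates $\int_0^Y\sigma'(y)\,dy/\sqrt{Y - y}$ with the substitution $y = Y\sin^2s$ (which removes the endpoint singularity, since $dy/\sqrt{Y - y} = 2\sqrt Y\sin s\,ds$) and a midpoint rule, then divides by $\sqrt{2g}$ to get a descent time. The function names, the values of $g$ and $T$, and the constant-slope comparison curve are assumptions made for illustration.

```python
import math

def abel_transform(sigma_prime, Y, steps=2000):
    # Midpoint rule for the integral of sigma'(y) dy / sqrt(Y - y) over [0, Y],
    # using y = Y sin^2(s), so that dy / sqrt(Y - y) = 2 sqrt(Y) sin(s) ds.
    ds = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * ds
        y = Y * math.sin(s)**2
        total += sigma_prime(y) * 2.0 * math.sqrt(Y) * math.sin(s) * ds
    return total

g, T = 9.8, 1.0
c = math.sqrt(2.0 * g) * T / math.pi

def cycloid_sigma_prime(y):
    # sigma'(y) = sqrt(2g) T / (pi sqrt(y)), from Abel's solution.
    return c / math.sqrt(y)

def descent_time(sigma_prime, Y):
    return abel_transform(sigma_prime, Y) / math.sqrt(2.0 * g)

for Y in (0.1, 0.5, 1.0):
    # Second column: cycloid; third column: a straight incline with ds/dy = 2.
    print(Y, descent_time(cycloid_sigma_prime, Y), descent_time(lambda y: 2.0, Y))
```

For the cycloid the computed time is the same for every starting height, while for the straight incline it grows like $\sqrt Y$.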
¹ Numerous approaches are possible, including an elementary integration. Note that
the substitution $u = (Y - y)/(A - y)$ reduces the integral to $\int_0^1 du/(u(1 - u))^{1/2} =
\int_0^1 u^{-1/2}(1 - u)^{-1/2}\,du$, and $\int_0^1 u^{\alpha-1}(1 - u)^{\beta-1}\,du = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$, which is the
formula that Abel starts from, and which could be used to adapt this proof for the
general case.


Envelopes 321 


ADDENDUM 8B 
ENVELOPES 


Envelopes, which have been mentioned in this chapter, will play an important
role in Chapter 15 and later chapters, so we will give a brief discussion of this
topic, which is often slighted in modern treatments of differential geometry.

We'll begin by considering the simple case of a 1-parameter family $\alpha$ of curves
in the plane, given by $\alpha(u) = t \mapsto \alpha(u,t)$ for some $C^\infty$ function $\alpha\colon [0,1] \times
[0,1] \to \mathbb{R}^2$. An envelope of this family is defined to be a curve $c$ which is not
a member of this family but which is tangent to some member of the family at
every point. Unfortunately, it often turns out that the envelope of a perfectly
nice family of curves has a cusp or something worse, as shown below, but we
won't be worrying too much about this.





The classical way of finding the envelope of α was very geometric. For each u,
we let c(u) be the limit, as ε → 0, of the intersection of α(u) and α(u + ε): the
envelope consists of the “intersections of members of the family with another
member infinitely close to it”. The picture below shows that this idea can run
into some serious difficulties. Nevertheless, it often works out rather well in
particular cases, and even in the general case it leads us to the proper analytic
condition, when we argue as follows.

[Figure: envelope of a family of cubics]




Let us consider first the case where our curves α(u) are all expressed as
the graphs of functions; thus there is a function (u,x) ↦ f(u,x) such that
α(u,t) = (t, f(u,t)). Suppose that the curve α(u) and the curve α(u + h)
intersect at the point

\[
(x_h, f(u, x_h)) = (x_h, f(u+h, x_h)).
\]

[Figure: the curves α(u) and α(u + h)]

Then we have

\[
0 = \frac{f(u+h, x_h) - f(u, x_h)}{h}.
\]

Assuming that x_h approaches a number x(u) as h → 0, we find that x(u) must
be a point for which

\[
(*) \qquad D_1 f(u, x(u)) = 0.
\]

If we find the points x(u) for all u, then the envelope should be the curve
consisting of all points (x(u), f(u, x(u))).
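
The limiting-intersection argument can be tried on a concrete family. The sketch below uses the family of lines f(u,x) = ux − u², which is an assumption chosen for illustration and does not appear in the text. Two neighboring lines meet at x = 2u + h, the condition (*) gives x(u) = 2u, and the resulting envelope points (2u, u²) lie on the parabola y = x²/4.

```python
def f(u, x):
    """Illustrative family of lines: the graph of x -> u*x - u**2 for each u."""
    return u * x - u * u

def intersection_x(u, h):
    """x-coordinate where the lines for u and u + h cross.

    Solving u*x - u**2 = (u + h)*x - (u + h)**2 for x gives
    x = ((u + h)**2 - u**2) / h, which equals 2u + h.
    """
    return ((u + h) ** 2 - u ** 2) / h

def d1_f(u, x, eps=1e-5):
    """Central-difference estimate of D_1 f, the derivative in the parameter u."""
    return (f(u + eps, x) - f(u - eps, x)) / (2.0 * eps)

def envelope_point(u):
    """Point (x(u), f(u, x(u))) with x(u) = 2u, the solution of D_1 f = 0."""
    x = 2.0 * u
    return (x, f(u, x))

if __name__ == "__main__":
    u = 1.5
    for h in (1e-1, 1e-3, 1e-6):
        print(h, intersection_x(u, h))     # tends to 2u = 3 as h shrinks
    print(d1_f(u, 2.0 * u))                # approximately 0
    print(envelope_point(u))               # lies on y = x**2 / 4
```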

If we are given a general family α, not necessarily expressed as graphs of
functions, then we can introduce the function f in two steps. We first determine
t(u, x) so that

\[
(1) \qquad \alpha_1(u, t(u,x)) = x,
\]

and then define

\[
(2) \qquad f(u, x) = \alpha_2(u, t(u,x)).
\]

Then equation (*) becomes

\[
(3) \qquad 0 = D_1\alpha_2(u, t(u,x)) + D_2\alpha_2(u, t(u,x)) \cdot D_1 t(u,x),
\]

while equation (1) gives

\[
D_1\alpha_1(u, t(u,x)) + D_2\alpha_1(u, t(u,x)) \cdot D_1 t(u,x) = 0,
\qquad
D_1 t(u,x) = -\frac{D_1\alpha_1(u, t(u,x))}{D_2\alpha_1(u, t(u,x))}.
\]

Substituting this into (3), we obtain

\[
[D_1\alpha_2 \cdot D_2\alpha_1 - D_1\alpha_1 \cdot D_2\alpha_2](u, t(u,x)) = 0.
\]

Thus we find that the envelope should consist of points α(u,t) where (u,t)
satisfies

\[
(**) \qquad \det\bigl(D_j\alpha_i(u,t)\bigr) = 0.
\]
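
Condition (**) can be tested numerically on a family that is not given as graphs. The sketch below uses unit circles centered at (u, 0), so that α(u,t) = (u + cos t, sin t); this family, the finite-difference step, and the function names are all assumptions made for illustration. The Jacobian determinant works out to cos t, so it vanishes at t = π/2 (and at 3π/2), giving the envelope lines y = 1 and y = −1.

```python
import math

def alpha(u, t):
    """Illustrative family: the unit circle centered at (u, 0), parameterized by t."""
    return (u + math.cos(t), math.sin(t))

def jacobian_det(u, t, eps=1e-6):
    """Central-difference estimate of det(D_j alpha_i)(u, t)."""
    du = [(p - m) / (2.0 * eps) for p, m in zip(alpha(u + eps, t), alpha(u - eps, t))]
    dt = [(p - m) / (2.0 * eps) for p, m in zip(alpha(u, t + eps), alpha(u, t - eps))]
    return du[0] * dt[1] - du[1] * dt[0]

def envelope_parameter(u, lo=0.0, hi=math.pi, iters=60):
    """Bisection for a zero of the determinant in t, assuming a sign change on [lo, hi]."""
    f_lo = jacobian_det(u, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = jacobian_det(u, mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # zero lies in the upper half
        else:
            hi = mid                # zero lies in the lower half
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for u in (0.0, 0.7, 2.0):
        t = envelope_parameter(u)
        print(u, t, alpha(u, t))    # t is near pi/2, point is near (u, 1)
```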




Now even without resorting to the motivating geometric construction, it is
clear that if there is an envelope of the family α, then it must be a subset of the
points α(u,t) for which (u,t) satisfies (**). For, if the determinant in (**) is non-
zero, then α is an immersion at (u,t), and the curves α(u) form a foliation of
a neighborhood of α(u,t); consequently, the only curve through α(u,t) which
is tangent to some curve of the family at each point is α(u) itself, which means
that α(u,t) cannot be a point of an envelope. Problem 2 (d) gives an example
of the use of this criterion.

Envelopes of a family of surfaces in R³, which will be our main interest later
on, are obviously best handled in this general way, and we will basically need
the concept, rather than any particular methods of calculation. The special case
of the envelope of a family of planes in R³ is discussed in DG, Vol. 3, Chap. 3,
Addendum, from which most of the foregoing material was taken.




ADDENDUM 8C 


STABILITY OF SOLUTIONS 
OF DIFFERENTIAL EQUATIONS 


To describe the question of stability for solutions of a first order equation, 
we'll begin by discussing the 2-dimensional case, which is the one we will apply 
in Addendum 10A, in order to draw some illustrative pictures, merely indicating 
the situation, without giving proofs, for which a reference is given at the end. 

So we consider a first order equation 


\[
(c_1', c_2')(t) = \bigl(f(c_1(t), c_2(t)),\; g(c_1(t), c_2(t))\bigr)
\]


corresponding to the vector field X with components (f(x, y), g(x, y)) at (x, y), 
where the integral curves near a 0 can have numerous different arrangements. 




In order to get information about the integral curves of the vector field X
near the 0 point, we “linearize”, by considering the Jacobian matrix

\[
A = \begin{pmatrix}
\partial f/\partial x & \partial g/\partial x \\
\partial f/\partial y & \partial g/\partial y
\end{pmatrix}
\]

at the point in question, which we'll consider to be the origin, for convenience.
The determination of stability will not be affected by a linear change of coordi-
nates in the plane, which will change A to a conjugate BAB⁻¹, so it’s useful to
examine how the possible canonical forms for A correspond to pictures of the
integral curves.

If the characteristic polynomial of A has a double root λ, then we have two
possible canonical forms,

\[
\begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}
\qquad
\begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.
\]

The accompanying pictures show the case λ > 0, while the arrows will be
reversed for λ < 0.




For the first canonical form, our equation is approximated by

\[
\begin{pmatrix} c_1' \\ c_2' \end{pmatrix}
= \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix}
  \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},
\]

having solutions with components ae^{λt}, be^{λt}. The solutions near the 0 solution
at (0,0) will have the basic characteristics of this solution: if λ > 0, then in time
1/λ they increase by a factor of about e, so the 0 solution is not stable; on the
other hand, it is stable for λ < 0.

For the second canonical form our equation is approximated by

\[
\begin{pmatrix} c_1' \\ c_2' \end{pmatrix}
= \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}
  \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
= \begin{pmatrix} \lambda c_1 + c_2 \\ \lambda c_2 \end{pmatrix},
\]

having solutions with components ate^{λt} + be^{λt}, ae^{λt}, and again if λ > 0, then
one component of solutions near the 0 solution increases by a factor of about e
in time 1/λ.
When there are distinct real roots λ₁ and λ₂, we have the following pictures,

[Figure: integral curves for 0 < λ₁ < λ₂ and for λ₂ < 0 < λ₁]

with the arrows reversed when the signs of λ₁ and λ₂ are reversed. In the second
case (a “saddle point”), either c₁ or c₂ will involve a positive exponential, so that
the 0 solution is always unstable.

When A is singular (λ = 0 for a double root, or at least one of λ₁ and λ₂ = 0
for distinct roots), no conclusion is possible.

When there are complex roots u + iv and u − iv, the canonical form is

\[
\begin{pmatrix} u & v \\ -v & u \end{pmatrix}
\]

[Figure: integral curves for u > 0]

with the arrows in the picture reversed for u < 0. Solving the equations
(c₁′, c₂′) = (uc₁ + vc₂, −vc₁ + uc₂) by setting

\[
z = c_1 + i c_2 \;\Longrightarrow\; z' = (u - iv)\,z \;\Longrightarrow\; z = e^{ut} e^{-ivt},
\]

we have (c₁(t), c₂(t)) = e^{ut}(cos vt, −sin vt), and the 0 solution is unstable for
u > 0, and stable for u < 0.

The case u = 0, where we have purely imaginary roots iv and −iv, is again
indeterminate. Although we might have a picture with closed curves, the curves
could also spiral outwards. Since our approximating equation is c₁′ = vc₂,
c₂′ = −vc₁, with solutions c₁(t) = sin(vt), c₂(t) = cos(vt), when we do have
closed curves their period is close to 2π/v.

The general result, in n dimensions, is that stability is assured if the real parts
of the eigenvalues of the Jacobian matrix at the 0 of the vector field are all
negative, and instability will always occur if the real part of any eigenvalue is
positive. A modern proof may be found in Palais and Palais [1; pp. 47, 53ff],
where the result is first proved for linear equations and then extended to the
general case by approximating to a linear equation.
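
For the 2-dimensional case the eigenvalue criterion is easy to mechanize. The following is a minimal sketch; the function names, the tolerance, and the sample matrices are assumptions made for illustration. It computes the two eigenvalues from the trace and determinant and reports "indeterminate" when the largest real part is zero, where the linearization gives no conclusion.

```python
import cmath

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix [[a, b], [c, d]] from its trace and determinant."""
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)

def classify(a, b, c, d, tol=1e-12):
    """Stability verdict for the zero solution, based only on the linearization."""
    l1, l2 = eigenvalues_2x2(a, b, c, d)
    largest = max(l1.real, l2.real)
    if largest < -tol:
        return "stable"          # all real parts negative
    if largest > tol:
        return "unstable"        # some real part positive
    return "indeterminate"       # largest real part is zero: no conclusion

if __name__ == "__main__":
    print(classify(-1.0, 0.0, 0.0, -1.0))   # double root, negative
    print(classify(-1.0, 0.0, 0.0, 2.0))    # saddle point
    print(classify(0.5, 1.0, -1.0, 0.5))    # complex roots, positive real part
    print(classify(0.0, 1.0, -1.0, 0.0))    # purely imaginary roots
```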




PROBLEMS 
1. The parameterization of the upside-down cycloid shows that

\[
\frac{dy}{dx} = \frac{dy}{d\theta}\Big/\frac{dx}{d\theta} = \frac{-\sin\theta}{1-\cos\theta},
\]

or, for rotation from the bottom, so that we replace θ by π + θ, the tangent line
makes an angle α with

\[
\tan\alpha = \frac{\sin\theta}{1+\cos\theta}.
\]

Check that α = θ/2, though it is not so clear how one would notice this!


2. We described the involute of a curve in terms of pulling out a string wound
around the curve, and the actual definition is simply a translation of this descrip-
tion. If c is a curve parameterized by arclength s, with unit tangent vector t(s),
we define ℐ(c), the involute of c, by

\[
\mathcal{I}(c)(s) = c(s) - s\,t(s).
\]

[Figure: the points c(0), c(s) and ℐ(c)(s), with the tangent vector t(s)]

The curve c really has infinitely many different involutes, depending on which
point we choose as c(0); they are “parallel” curves, intersecting the tangents at
a constant distance from each other.

(a) Show that for γ = ℐ(c) we have ⟨γ′(s), t(s)⟩ = 0, as drawn in the picture.
Note that s is generally not the arclength parameterization for γ.

(b) For the cycloid

\[
x = \theta - \sin\theta, \qquad y = 1 - \cos\theta
\]

(taking a = 1 for simplicity), check that the tangent vector has length 2 sin(θ/2).
Choosing an initial point so that we simply have s(θ) = −4 cos(θ/2), check that
the involute is another cycloid of the same size and shape.




(c) Although not relevant to the discussion of the cycloidal pendulum, another
property of involutes may be mentioned. Recall¹ that for an arclength parame-
terized curve c with tangent vector t and normal vector n, the osculating circle
of c at s is the circle tangent to c at c(s) with radius 1/κ(s), where κ = ±|c″(s)|
is the curvature; the center of the osculating circle is thus at c(s) + n(s)/κ(s).
We define ℰ(c), the evolute of c, as the curve traced out by these centers,

\[
\mathcal{E}(c)(s) = c(s) + \frac{n(s)}{\kappa(s)}.
\]

[Figure: the evolute of an ellipse, with two osculating circles]

For γ = ℐ(c), consider ℰ(γ) = ℰ(ℐ(c)), defined by

\[
\mathcal{E}(\gamma)(s) = \gamma(s) + \frac{n_\gamma(s)}{\kappa_\gamma(s)}
 = c(s) - s\,t(s) + \frac{n_\gamma(s)}{\kappa_\gamma(s)}.
\]

Show that this is simply c(s) (remember that s is not the arclength parameter-
ization of γ), so that we have

\[
\mathcal{E}(\mathcal{I}(c)) = c.
\]

Of course, it is great fun to try to prove all this geometrically, as Huygens did,
with the center of the osculating circle defined as the limiting position of normals
near a point, or alternatively, with the evolute defined as the envelope of the
normals to the curve.

(d) For an arclength parameterized curve c(s) = (c₁(s), c₂(s)), with normal
n(s) = (−c₂′(s), c₁′(s)), use (**) on page 322 to show that the envelope of the
normals consists of points c(s) + n(s)/κ(s).


¹ E.g., DG, Vol. 2, Chap. 1.


3. (a) Multiplying the equation x″ + ω²x = 0 by x′, conclude that

\[
x'^2 + \omega^2 x^2 = 2E
\]

for a constant E.
(b) Taking the initial conditions x(0) = a, x′(0) = 0, deduce that

\[
x' = \omega\sqrt{a^2 - x^2},
\]

and then that

\[
\omega t = \arcsin\left(\frac{x}{a}\right) - \frac{\pi}{2}
\;\Longrightarrow\;
x = a\sin\left(\omega t + \frac{\pi}{2}\right) = a\cos\omega t.
\]
4. In Addendum 4B, we considered the case of one-dimensional motion with
a potential energy function U [the V that arose in that Addendum], so that we
have the equation of motion

\[
\tfrac12 x'^2 + U(x) = E.
\]

If U has a relative minimum point, where the energy is E₀ say, then for a range
of E > E₀, the solution with energy E will oscillate between two values x₁(E)
and x₂(E), and the period P(E) of this oscillation is given by

\[
(*) \qquad P(E) = 2\int_{x_1(E)}^{x_2(E)} \frac{dx}{x'}
 = \sqrt{2}\int_{x_1(E)}^{x_2(E)} \frac{dx}{\sqrt{E - U(x)}}.
\]

(a) For convenience, assume that the minimum value of U occurs at x = 0,
and has the value U(0) = 0; the functions x₁ and x₂ are then the inverses of U
restricted to the negative and positive x axes, respectively, and for the above
integral the substitution y = U(x) can be written as x = x₂(y) for x > 0 and
x = x₁(y) for x < 0. Conclude that

[Figure: graph of U, with the points x₁(E) and x₂(E)]

\[
P(E) = \sqrt{2}\int_0^E \frac{\sigma'(y)\,dy}{\sqrt{E - y}}
\qquad \text{for } \sigma = x_2 - x_1.
\]

(b) Use Abel’s result from Addendum A to show that

\[
x_2(y) - x_1(y) = \frac{1}{\sqrt{2}\,\pi}\int_0^y \frac{P(Y)\,dY}{\sqrt{y - Y}}.
\]

(c) Conclude that we can choose the shape of the graph of U arbitrarily for
x > 0, and then determine the shape for x < 0 so that (*) holds for the given
function P.




(d) For a symmetric graph, U(x) = U(−x), we must have

\[
x_2(y) = -x_1(y) = \frac{1}{2\sqrt{2}\,\pi}\int_0^y \frac{P(Y)\,dY}{\sqrt{y - Y}}.
\]


5. (a) For the solution

\[
u(x,t) = f(x + vt) + g(x - vt)
\]

on page 317, with given initial conditions

\[
u(x,0) = \varphi(x), \qquad u_t(x,0) = \psi(x),
\]

find equations for f′(x) and g′(x), and then for f(x) and g(x) in terms of φ
and integrals involving ψ, and conclude that we have d’Alembert’s formula

\[
u(x,t) = \frac{\varphi(x + vt) + \varphi(x - vt)}{2}
 + \frac{1}{2v}\int_{x - vt}^{x + vt} \psi(s)\,ds.
\]

(b) Consider the wave equation only for positive x and t:

\[
\frac{\partial^2 u}{\partial t^2} = v^2\,\frac{\partial^2 u}{\partial x^2},
\qquad x, t > 0
\]

with initial conditions

\[
u(x,0) = \varphi(x), \quad u_t(x,0) = \psi(x) \quad \text{for } x > 0,
\qquad
u(0,t) = 0 \quad \text{for } t > 0,
\]

where φ(0) = 0 = ψ(0). Suppose we extend φ and ψ as odd functions on R.
Show that the solution is again given by d’Alembert’s formula.


6. For a solution u(x,t) = f(x + vt) + g(x − vt) of the wave equation, we have
u(x,0) = f(x) + g(x).
(a) If u(0,t) = 0 for all t, then g(x) = −f(−x), so that

\[
u(x,t) = f(x + vt) - f(-x + vt).
\]

(b) To obtain

\[
u(x,0) = f(x) - f(-x),
\]

we can take f(x) = ½u(x,0) on [0, L], and then use this equation to define f
on [−L, 0].

[Figure: f = ½u(x,0) on [0, L], extended to [−L, 0]]

(c) If we extend f to be periodic, with period 2L on the whole line, then we
also have the other boundary condition u(L,t) = 0 for all t.
If we start with the function f shown below, and then extend it in this way,
and then define g, then we can trace the movement of f left and g right. Since
the actual graphs of f and g would both be solid lines, it would look as if the
two waves were being reflected from 0 and L.

[Figure: f moving left and g moving right, reflected at 0 and L]

(d) Apply this to the two special cases where u(x,0) is either sin([nπ/L]x) or
cos([nπ/L]x); these special cases, which might have been noted independently,
then imply the result in general, by using the Fourier series expansion.


CHAPTER 9 
RIGID BODY MOTION 


Euler was one of the first to analyze the motion of rigid bodies in any detail,
and he wrote his equations in terms of a rotating, non-inertial, coordinate
system. Aside from the use of such coordinate systems in the study of rigid body
motion, the basic preparation will also serve as an introduction to the material
in Chapter 10.


Rotating coordinate systems. For a one-parameter family of rotations B(t) of
R³, at time t we can consider the orthonormal basis (u₁(t), u₂(t), u₃(t)) defined
by uᵢ(t) = B(t)(eᵢ) for the standard orthonormal basis (e₁, e₂, e₃). These can
be taken as the unit vectors of a new coordinate system at time t, so that in this
way we obtain a “rotating coordinate system”.

Now given a curve r in R³, with

\[
r = r_1 \cdot e_1 + r_2 \cdot e_2 + r_3 \cdot e_3, \quad \text{or simply } r = (r_1, r_2, r_3),
\]

we want to consider the components of this curve when written with respect to
this rotating coordinate system, that is we want to write

\[
(a) \qquad r = \rho_1 \cdot u_1 + \rho_2 \cdot u_2 + \rho_3 \cdot u_3
 = \rho_1 \cdot B(e_1) + \rho_2 \cdot B(e_2) + \rho_3 \cdot B(e_3),
\]

where we are using abbreviated notation: B(eᵢ) stands for t ↦ B(t)(eᵢ).

An observer rotating along with these coordinate systems will regard these
rotating coordinates as simply being a standard set of coordinates, and for such
an observer the ρᵢ are the coordinates of r in this standard set of coordinates,
and r will simply be described as the curve

\[
(b) \qquad \rho = (\rho_1, \rho_2, \rho_3) \in \mathbb{R}^3.
\]

To relate these two descriptions, we simply note that comparison of (a) and (b)
gives the following equation in R³,

\[
r = B(\rho), \quad \text{again using abbreviated notation.}
\]

Continuing to use abbreviated notation for convenience, so that, for example,
B(ρ′) stands for t ↦ B(t)(ρ′(t)), we now have

\[
r' = B(\rho') + B'(\rho) = B(\rho') + B'B^{-1}(r),
\]



and if we introduce ω, giving the components of the skew-symmetric matrix
B′B⁻¹, as on page 186, we can write

\[
(1) \qquad r' = B(\rho') + \omega \times r.
\]

To interpret the term B(ρ′), we note that since

\[
\rho' = (\rho_1', \rho_2', \rho_3') = \rho_1' \cdot e_1 + \rho_2' \cdot e_2 + \rho_3' \cdot e_3,
\]

we have

\[
B(\rho') = \rho_1' \cdot u_1 + \rho_2' \cdot u_2 + \rho_3' \cdot u_3.
\]

This means that B(ρ′) is just what the observer would compute for the derivative
of the curve by taking the components of r in the rotating coordinates, and
then simply differentiating these components, what one might call the “rotating
observer’s derivative”. We will denote it by r°, so that we can write

\[
(2) \qquad r' = r^{\circ} + \omega \times r.
\]

Thus, the derivative r′ is the sum of the “rotating observer’s derivative” r° and a
correction term ω × r. [Physicists often write something like dr/dt = ∂r/∂t + ω × r,
with ∂/∂t indicating the operation where only the components in the rotating
coordinate system are being differentiated.]
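
The relation r′ = B(ρ′) + ω × r can be checked numerically in a simple case. The sketch below assumes a rotation about the z-axis at a constant rate, so that ω = (0, 0, Ω), together with an arbitrary sample curve ρ(t); both choices, and all function names, are made up for illustration. It compares a finite-difference derivative of r = B(ρ) with the right-hand side.

```python
import math

OMEGA = 0.7  # assumed constant angular speed of the rotating frame about the z-axis

def B(t, v):
    """Apply the rotation B(t): counterclockwise rotation about z by angle OMEGA * t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

def rho(t):
    """Sample curve as seen by the rotating observer (an arbitrary choice)."""
    return (math.cos(t), t * t, math.sin(2.0 * t))

def rho_prime(t):
    """Componentwise derivative of rho."""
    return (-math.sin(t), 2.0 * t, 2.0 * math.cos(2.0 * t))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def r(t):
    """The same curve in the fixed coordinates: r = B(rho)."""
    return B(t, rho(t))

def derivative_mismatch(t, eps=1e-6):
    """Largest componentwise gap between r'(t) and B(rho') + omega x r."""
    lhs = [(p - m) / (2.0 * eps) for p, m in zip(r(t + eps), r(t - eps))]
    w_cross_r = cross((0.0, 0.0, OMEGA), r(t))
    rhs = [a + b for a, b in zip(B(t, rho_prime(t)), w_cross_r)]
    return max(abs(a - b) for a, b in zip(lhs, rhs))

if __name__ == "__main__":
    for t in (0.0, 0.9, 2.5):
        print(t, derivative_mismatch(t))
```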


The Euler equations. In Chapter 10 we will consider rotating coordinate sys-
tems in general, but in this chapter we are mainly concerned with a rigid body
rotating about some fixed point, and we will usually take the rotating coordinate
axes to be the principal axes of inertia of the body at all times, so that the ω of
equation (1) is the same as the angular velocity ω of the rotating body. In this par-
ticular context the rotating observer’s derivative r° is often called the “body de-
rivative” [and physicists often write something like
(dr/dt)_space = (dr/dt)_body + ω × r].

In particular, we can now apply equation (2) to the angular momentum
curve L of our rotating body to get

\[
(E_\tau) \qquad \tau = L^{\circ} + \omega \times L.
\]

When there are no external forces, so that τ = 0, we then have

\[
(E) \qquad L^{\circ} = L \times \omega.
\]


Since the rotating coordinate axes are now the principal axes of inertia of
the body at all times, the components L₁, L₂, L₃ of the vector L will just be
ω₁I₁, ω₂I₂, ω₃I₃, where the constants I₁, I₂, I₃ are the principal moments of
inertia. When there are no external forces, by taking the components of our
vector equation (E) in the rotating coordinate system we obtain the standard
form of the Euler equations:

\[
(E) \qquad
\begin{aligned}
I_1\omega_1' &= (I_2 - I_3)\,\omega_2\omega_3 \\
I_2\omega_2' &= (I_3 - I_1)\,\omega_3\omega_1 \\
I_3\omega_3' &= (I_1 - I_2)\,\omega_1\omega_2.
\end{aligned}
\]

For the case of external forces, equation (E_τ) yields corresponding equations

\[
(E_\tau) \qquad
\begin{aligned}
\tau_1 &= I_1\omega_1' + (I_3 - I_2)\,\omega_2\omega_3 \\
\tau_2 &= I_2\omega_2' + (I_1 - I_3)\,\omega_3\omega_1 \\
\tau_3 &= I_3\omega_3' + (I_2 - I_1)\,\omega_1\omega_2.
\end{aligned}
\]


As a particular consequence of the Euler equations (E), we can reprove the
result on page 193 concerning rotation of a rigid body about an axis when there
are no external forces, though we must assume (as proved on page 192), that
the angular velocity is constant: This means that the ωᵢ are all constants, so
that the right hand side of each equation of (E) is 0; consequently, if the Iᵢ are
all distinct, we have 0 = ω₂ω₃ = ω₃ω₁ = ω₁ω₂, which means that if one of
the ωᵢ is non-zero, the other two must be zero, and thus our rotation is around
the principal axis corresponding to the non-zero ωᵢ.

We can obtain more sensitive information by using conservation of angular
momentum L, and in particular of its norm L = |L|, together with conservation
of kinetic energy T, writing L and T in terms of the components Lᵢ = ωᵢIᵢ
of L. From the equation for T_rot on page 194,

\[
(T) \qquad 2T = \langle I(\omega), \omega \rangle,
\]

we have

\[
(1) \qquad 2T = \frac{L_1^2}{I_1} + \frac{L_2^2}{I_2} + \frac{L_3^2}{I_3}
\]
\[
(2) \qquad L^2 = L_1^2 + L_2^2 + L_3^2.
\]
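
Both conserved quantities can be monitored along a numerical solution of the torque-free Euler equations. The following is a minimal sketch: the fourth-order Runge-Kutta integrator, the moments of inertia (1, 2, 3), the initial angular velocity, and all function names are assumptions made for illustration.

```python
def euler_rhs(w, I):
    """Right-hand sides of the torque-free Euler equations, solved for the derivatives."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2.0), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2.0), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def invariants(w, I):
    """Return (2T, L^2) computed from the components L_i = omega_i * I_i."""
    L = [Ii * wi for Ii, wi in zip(I, w)]
    two_T = sum(Li * Li / Ii for Li, Ii in zip(L, I))
    L_squared = sum(Li * Li for Li in L)
    return two_T, L_squared

def drift(w0, I, dt=1e-3, steps=5000):
    """Absolute change in (2T, L^2) after integrating for steps * dt."""
    start = invariants(w0, I)
    w = w0
    for _ in range(steps):
        w = rk4_step(w, I, dt)
    end = invariants(w, I)
    return abs(end[0] - start[0]), abs(end[1] - start[1])

if __name__ == "__main__":
    print(drift((1.0, 0.5, 0.8), (1.0, 2.0, 3.0)))
```

The printed drifts should be at the level of the integrator's truncation and rounding error.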


So L lies in the intersection of an ellipsoid and a sphere (in the rotating coordi-
nate system). To be specific, let us say that I₁ < I₂ < I₃, so that the semiaxes
of the ellipsoid are

\[
\sqrt{2T I_1} < \sqrt{2T I_2} < \sqrt{2T I_3}.
\]




Then (L₁, L₂, L₃) lies on the ellipsoid that appears in the figure below as
the surface x²/I₁ + y²/I₂ + z²/I₃ = 2T. Part (a) of the figure shows the
smallest sphere, of radius √(2TI₁), that intersects the ellipsoid. The intersection
consists of the two points at the end of the smallest axis, and as the radius of
the sphere is increased we obtain a family of curves, a few of which are shown.
Similarly, (b) shows the largest sphere, of radius √(2TI₃), intersecting the ellipsoid
in the two points at the end of the largest axis, with some of the curves obtained
as the radius of the sphere is decreased added in.

[Figure: (a), (b), (c) intersections of the ellipsoid with spheres of increasing radius]

The situation for the sphere of radius √(2TI₂) is completely different, however.
If we set L² = 2TI₂ in (2), we can solve (1) and (2) to get

\[
L_3^2 = \frac{I_3(I_2 - I_1)}{I_1(I_3 - I_2)}\,L_1^2 = a^2 L_1^2, \text{ say.}
\]

This means that the intersection of the ellipsoid and the sphere is the same
as the intersection of the ellipsoid with the planes z = ±ax, and thus two
ellipses (c), with the remainder of the surface of the ellipsoid accounted for by
the intersections with the curves from the first two families. (In the symmetric
case when two moments of inertia are equal, the situation is rather different: if
I₁ = I₂, say, then the intersections are simply the circles parallel to the z-axis.)

As an application of all this information we note the following. If our body
is rotating about an axis with moment of inertia I₁ or I₃, and is then nudged a
bit, the path of L will be changed from its constant path to one of the nearby
small paths; more precisely, since L might change, we should say that its path
will change to a small path on a nearby ellipsoid. So the body, though no longer
rotating about an axis, will still stay close to its original position, and in this sense
rotations about these axes are stable. On the other hand, if it is rotating about
the axis with moment of inertia I₂, then the slightest nudge will send it onto a
path that rapidly moves away (except in the very special case that it moves onto
one of the two ellipses). Problem 3 gives an analytic treatment.
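
The contrast between the three axes shows up clearly in a numerical experiment. The sketch below is illustrative only: the moments of inertia (1, 2, 3), the size of the nudge, the Runge-Kutta integrator, and the function names are all assumptions. Starting with unit angular velocity about one principal axis plus a small perturbation in the other two components, it records how large those other components ever become.

```python
def euler_rhs(w, I):
    """Right-hand sides of the torque-free Euler equations, solved for the derivatives."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2.0), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2.0), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

def max_deviation(axis, I=(1.0, 2.0, 3.0), nudge=1e-3, dt=0.01, steps=5000):
    """Largest size reached by the two components of omega off the chosen axis.

    axis is 0, 1 or 2; with I1 < I2 < I3 the intermediate axis is axis 1.
    """
    w = [nudge, nudge, nudge]
    w[axis] = 1.0
    w = tuple(w)
    worst = 0.0
    for _ in range(steps):
        w = rk4_step(w, I, dt)
        worst = max(worst, max(abs(w[k]) for k in range(3) if k != axis))
    return worst

if __name__ == "__main__":
    for axis in range(3):
        print("axis", axis + 1, "max off-axis component:", max_deviation(axis))
```

With these sample values the off-axis components stay at the size of the nudge for the axes of smallest and largest moment of inertia, and grow to order one for the intermediate axis.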

Our result can be demonstrated effectively with a rectangular solid of three
substantially unequal dimensions, like a filled match box of the sort used to hold
camping or kitchen matches. Whatever the initial position of the box, it is easy
to get it to spin (a) around the shortest axis as it falls, this axis being the one
with the largest moment of inertia. Similarly, although a bit more care may
be required, the box can be made to spin (b) around the largest axis, with the
smallest moment of inertia. On the other hand, attempts to get the box to spin
around the other axis (c) almost always result in an unwieldy tumbling motion.

[Figure: (a) axis of largest moment of inertia; (b) axis of smallest moment of
inertia; (c) axis of intermediate moment of inertia]

For those who are sports maniacs, rather than pyromaniacs, stability about
the longest axis can be demonstrated more sportingly with a tennis racket, as
you jauntily throw it into the air, with a well-executed spin, and deftly snatch it
back as it falls with its axis of spin still pointing in the same direction.


Poinsot’s geometric description. For the general rotating body with one fixed
point, in the absence of external torque, we can use equations (1) and (2) on
page 334 to express ω₁ and ω₃ in terms of ω₂, and substitute into the middle
Euler equation to get an equation for ω₂′, leading (Problem 1) to solutions in
terms of elliptic integrals, but they are not especially revealing. A description
due to Poinsot, who preferred geometric constructions to analytic equations, is
sometimes invoked instead.

As a body rotates, the various vectors ω(t) satisfy the equation (T) on page 194,
or in the form on page 334,

\[
2T = \langle I(\omega), \omega \rangle,
\]

which means that the vectors ω(t)/√(2T) lie on the inertia ellipsoid. Since the
equation of the inertia ellipsoid in body coordinates can be written as 1 =
I₁x² + I₂y² + I₃z² = F(x, y, z), say, the normal at such a point on the inertia
ellipsoid has the direction of the vector (∂F/∂x, ∂F/∂y, ∂F/∂z) at ω(t)/√(2T),
and is thus a multiple of

\[
(2I_1\omega_1(t),\, 2I_2\omega_2(t),\, 2I_3\omega_3(t)) = (2L_1(t),\, 2L_2(t),\, 2L_3(t)),
\]

so the normal at ω(t)/√(2T) is parallel to L. This means that in our standard
coordinate system, the normal of the inertia ellipsoid at ω(t)/√(2T) has the
constant direction of L, so the tangent plane at any ω(t)/√(2T) is perpendicular
to L. Moreover, remembering that L = I(ω) (as on page 188), we find that the
distance from the fixed point to this plane is

[Figure: the inertia ellipsoid, with its normal]

\[
\frac{\langle \omega(t)/\sqrt{2T},\, L \rangle}{L}
 = \frac{\langle \omega(t),\, I(\omega(t)) \rangle}{L\sqrt{2T}}
 = \frac{2T}{L\sqrt{2T}}
 = \frac{\sqrt{2T}}{L}.
\]

Since T and L are constants, these planes are all the same plane, known as the
invariable plane.
In other words, as our body moves, its inertia ellipsoid revolves around the
fixed point O in such a way that at time t the ellipsoid, at its point ω(t)/√(2T),
is tangent to the invariable plane at some point h(t), as in (a). In addition,
condition (3) of the Proposition on page 222 holds for B(t) [with the C of
condition (3) being the cross-product with ω(t)], so the ellipsoid is rolling along
the curve h; more conveniently, as in (b), we can simply visualize a cone rolling
about the fixed point O.

[Figure: (a) the inertia ellipsoid rolling on the invariable plane; (b) a cone rolling
about the fixed point O]

Poinsot [1] called the curve t ↦ ω(t)/√(2T) the polhode of the moving body,
from the Greek πόλος = pole, terminal and ὁδός = road, route. The figure on
page 335 shows the ellipsoid ⟨I(ω), ω⟩ = 2T, a multiple of the inertia ellipsoid,
with the paths of L; these are not simply multiples of the polhodes, but since
Lᵢ = ωᵢIᵢ, the polhodes have the same general arrangement on the inertia
ellipsoid (see also the remarks at the beginning of Addendum B).




Poinsot called the path h traced out on the invariable plane the herpolhode,
so that one can intone: The polhode rolls without slipping on the herpolhode
lying in the invariable plane. The name “herpolhode” was used to suggest a
snakelike appearance, as Poinsot drew it in one of his diagrams, like the one
in (a) of the figure below. Actually, as in (b), the herpolhode, when viewed
from above, is always concave with respect to the point P at which L intersects
the invariable plane; the interesting, but computationally complicated, proof is
outlined in Addendum B.

[Figure: (a) herpolhode à la Poinsot; (b) an actual herpolhode as viewed from above]

After the inertia ellipsoid has rolled over the entire
polhode, which is a closed curve, we are back to the initial conditions, except
that the body has turned around the L axis, so the herpolhode is made up of
repeating sections, just like the orbits studied in Chapter 4; in the above figure,
the point p comes from a symmetry point of the polhode, and the two “loose
ends” correspond to the symmetry point opposite it. If the angle through which
the body turns happens to be commensurable with 2π, then the whole motion
of the body will be periodic; otherwise, the herpolhode will fill up a dense region
of an annulus around P.

Poinsot’s result has been used to provide various elaborate descriptions of the
rotation of a rigid body about a fixed point in the absence of external forces,
but the only one likely to be mentioned nowadays is for the special symmetric
case where we have I₁ = I₂, say. In this case the polhodes are circles, and the
herpolhode for any particular rolling motion is a circle in the invariable plane,
and the axis of the inertia ellipsoid revolves at a constant rate around L.




The free symmetric top, in body coordinates. Mechanics books usually refer to
a body with I₁ = I₂ as a “symmetric top”. By “free” we mean that there are no
external forces, unlike the more familiar “heavy top” of everyday experience, to
be discussed later, where the external force of gravity creates a torque around
the point on which the top spins. Before giving an analytic treatment of the free
symmetric top mirroring the geometric analysis given by Poinsot’s method, we
will first consider the analysis in body coordinates.

For the free symmetric top, we immediately get ω₃′ = 0 from the third Euler
equation, so ω₃ is a constant, and the first two equations thus can be written in
terms of the constant

\[
\lambda = \omega_3\,\frac{I_3 - I_1}{I_1}
\]

as

\[
\begin{aligned}
\omega_1' &= -\lambda\omega_2 \\
\omega_2' &= \lambda\omega_1
\end{aligned}
\qquad \text{with the obvious¹ solutions} \qquad
\begin{aligned}
\omega_1 &= A\cos\lambda t \\
\omega_2 &= A\sin\lambda t
\end{aligned}
\]

for a constant A (more generally, we should write λt + b). This shows that ω
rotates around the symmetry axis with a constant angular frequency λ. Since L
has the components Lᵢ = Iᵢωᵢ, it follows that L rotates in a similar way.

[Figure: rotation of ω about the symmetry axis, for I₃ < I₁ and for I₃ > I₁]


Euler applied these results to the earth, which is an oblate spheroid with


to predict the Euler precession of the “geometric North Pole”, where the axis 
of rotation of the earth intersects its northern hemisphere, about the “celestial 
North Pole”, which we may take as the axis through the center of the earth 


¹ Disdaining the obvious, one can differentiate the first equation to get a second order
equation for ω₁, which will be one for harmonic motion; or one can write (ω₁ + iω₂)′ =
iλ(ω₁ + iω₂), with the solution ω₁ + iω₂ = A exp(iλt).




pointing to the “fixed stars” (the axis from the center of the earth to the North
Star will do).¹ The angular velocity should be given by λ on page 339,

\[
\lambda = \frac{I_3 - I_1}{I_1}\,\omega_3 \approx \frac{\omega_3}{300} \approx \frac{|\omega|}{300},
\]

since ω₃ is practically the same as |ω|.

If we use the day as our unit of time, then |ω| = 2π, so the period should be

\[
\approx 2\pi/\lambda = 300 \text{ days.}
\]
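
The arithmetic behind this estimate is short enough to spell out. In the sketch below the function names are made up, and the sample moments of inertia 300 and 301 are not measured values: they are chosen only so that (I₃ − I₁)/I₁ = 1/300, the ratio implied by the estimate above. Time is measured in days, so ω₃ = 2π.

```python
import math

def body_precession_rate(i1, i3, omega3):
    """The constant lambda = omega3 * (I3 - I1) / I1 for a free symmetric top."""
    return omega3 * (i3 - i1) / i1

def precession_period(i1, i3, omega3):
    """Period 2*pi/|lambda| of the rotation of omega about the symmetry axis."""
    return 2.0 * math.pi / abs(body_precession_rate(i1, i3, omega3))

if __name__ == "__main__":
    # One rotation per day, in units where time is measured in days.
    omega3 = 2.0 * math.pi
    print(precession_period(300.0, 301.0, omega3))
```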


Euler, using the measurements available in 1755, predicted a period of 355 days, 
but searches by several astronomers for motions with a period close to this 
were unsuccessful. In 1891, Chandler, looking for motions with possibly quite 
different periods, reported a motion with a period of about 14 months, now 
known as the Chandler wobble. This was initially received with considerable 
scepticism, until the difference was explained as due to the non-rigidity of the 
earth, with, as usual, several other phenomena eventually adding even more 
complications to the whole picture. 


The free symmetric top, in inertial coordinates. The body coordinates are the
obvious choice for examining the Euler precession, but for observing everyday
objects we want to consider the symmetric top in inertial coordinates (we can
consider a top in space, or a top thrown into the air, so that the gravitational
force merely causes the whole top to descend, or an arrangement like that of a
gyroscope, cf. page 355 ff). Although one can presumably derive the view in
inertial coordinates from the view in the body coordinates, it’s much easier to
note directly that the equations

\[
\begin{aligned}
L &= I_1(\omega_1 u_1 + \omega_2 u_2) + I_3\omega_3 u_3 \\
\omega &= \omega_1 u_1 + \omega_2 u_2 + \omega_3 u_3
\end{aligned}
\]

lead to L = I₁ω + (I₃ − I₁)ω₃u₃ = I₁(ω + λu₃), and thus

\[
(a) \qquad \omega = \frac{L}{I_1} - \lambda u_3.
\]

Remembering that the body derivative of u₃ is 0, we have

\[
u_3' = \omega \times u_3 = \frac{L}{I_1} \times u_3,
\]

and thus u₃ is rotating around the fixed vector L with frequency L/I₁. This
is known as the regular precession of the top. Since (a) shows that ω, L and u₃

¹ This precession is totally distinct from the astronomical “precession of the equinoxes”,
with a period of 26,000 years, that we will mention later.




are coplanar, it follows that ω also rotates around L with frequency L/I₁ (see
Problem 2 for the reconciliation between this frequency and the frequency λ
for the body coordinates).

[Figure: precession of u₃ and ω about L, for I₃ < I₁ and for I₃ > I₁]

The view in the body coordinates and in the inertial coordinates are some-
times combined, for maximal confusion, in a figure showing the “space cone”
around L swept out by ω, together with the “body cone” that ω sweeps out
around the symmetry axis in the body coordinates. Of course, in our inertial
system the body cone will be moving around L also; in fact, since ω is the
instantaneous axis of rotation of the top, the body cone is rolling on the station-
ary space cone without slipping (invoking an appropriate interpretation of the
Proposition on page 222). Note that for I₃ > I₁, the body cone is rolling on
the space cone now situated inside it.


Euler angles. Our analysis so far has used the fact that the configuration space
for a body with one fixed point is SO(3), without recourse to any specific co-
ordinate systems. To go further, however, we will need to introduce such a
coordinate system on SO(3), also due to Euler.

In the figure below, part (a) shows the standard x-y-z axes, part (b) the axes
X-Y-Z that they are taken into by some rotation, and part (c) the two together.



The following enlarged version of (c) shows the intersection of the (x, y)-plane
and the (X, Y)-plane, known as the line of nodes, a term borrowed from astron-
omy (cf. page 565). This diagram, of course, presumes that our rotation is not
simply a rotation around the z-axis.

[Figure: the x-y-z and X-Y-Z axes, with the line of nodes]

We can now specify our rotation by 3
coordinates: the counterclockwise angle φ of rotation about the z-axis that
takes the positive x-axis to one ray of the line of nodes; the angle θ of rotation
about the line of nodes that takes the positive z-axis to the positive Z-axis; and
finally, the angle ψ of the rotation about the Z-axis that takes our ray on the
line of nodes to the positive X-axis.¹ For these Euler angle coordinates to be
well-defined, we must take θ ∈ (0, π), and φ, ψ ∈ (0, 2π).

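Thus, a triple (φ, θ, ψ) determines a rotation B, namely the rotation that
takes the standard x-y-z axes to the X-Y-Z axes having these values. If we let
Z_φ, X_θ, Z_ψ be the rotations with the following matrices

\[
Z_\phi = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix},
\quad
X_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix},
\quad
Z_\psi = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]

then we claim that the rotation B can be written as the product

\[ B = Z_\psi X_\theta Z_\phi. \tag{$*$} \]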


In fact, after Z_φ, a rotation about the z-axis, the new first axis is the line of
nodes, and X_θ thus gives a rotation about this line, moving the z-axis to the
new third axis Z, and Z_ψ then describes the rotation about this axis.


¹ Unfortunately, many variants appear in the literature; φ and ψ may be interchanged,
angles may be measured in different directions, etc.




Functions φ(t), θ(t), ψ(t) determine corresponding rotations B(t), and the
components of the skew-symmetric matrix B′B⁻¹(t) then give us the vector
ω(t). We would like to determine these components directly in terms of φ(t),
θ(t), ψ(t). The easiest way to do this is to first express ω in terms of the triad

    e_z    = unit vector along the third axis z,
    e_N(t) = unit vector along the line of nodes,
    e_Z(t) = unit vector along the third body axis Z,

since, after all, these are the axes around which the angles (φ, θ, ψ) are
measured. Note that

(i) If θ = ψ = 0, with only φ changing, then B(t) simply involves a rotation
about the z-axis, so ω(t) will simply be φ′(t)·e_z.

(ii) If φ is constant and ψ = 0, with only θ changing, then B(t) simply involves
a rotation about the line of nodes, so ω(t) will simply be θ′(t)·e_N(t).

(iii) If φ and θ are constant, with only ψ varying, then the matrix B(t) simply
involves a rotation about the Z-axis, so ω(t) = ψ′(t)·e_Z(t).


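On the basis of observations like these, the following result is usually
considered to be obvious (cf. the note at the end of the proof).


1. LEMMA. The decomposition of ω is

\[ \omega(t) = \phi'(t)\cdot e_z + \theta'(t)\cdot e_N(t) + \psi'(t)\cdot e_Z(t). \]

PROOF. We have

\[ B(t) = Z_{\psi(t)} X_{\theta(t)} Z_{\phi(t)} = \Psi\Theta\Phi(t), \text{ say,} \]

and the components of ω are, up to sign, the off-diagonal components of the
matrix of

\[
\begin{aligned}
B'B^{-1} &= (\Psi\Theta\Phi)'(\Psi\Theta\Phi)^{-1} \\
 &= (\Psi\Theta\Phi' + \Psi\Theta'\Phi + \Psi'\Theta\Phi)\,\Phi^{-1}\Theta^{-1}\Psi^{-1} \\
 &= (\Psi\Theta)(\Phi'\Phi^{-1})(\Psi\Theta)^{-1} + \Psi(\Theta'\Theta^{-1})\Psi^{-1} + \Psi'\Psi^{-1}.
\end{aligned}
\]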


The third term involves rotations with φ and θ constant, since Ψ is our final
rotation. Observation (iii) shows that this provides the proper ψ′ term.

The second term involves rotations with φ constant, since the final rotation Ψ
doesn’t involve φ. Moreover, it also involves rotations with ψ = 0 because of
the conjugation by Ψ. This provides the proper θ′ term, by observation (ii).

Similarly, the first term involves rotations with θ = ψ = 0, because of the
conjugation by ΨΘ, providing the proper φ′ term, by observation (i). ❖


Note: The idea behind this proof is encapsulated in Problem 4, showing that in
certain cases “angular momentum vectors can be added”, when this statement 
is properly formulated. 


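Now we just have to express e_N and e_z in terms of e_X, e_Y and e_Z, the unit
vectors along the body axes X, Y and Z. From our diagram for the Euler angles
we easily see that

\[ e_N = (\cos\psi)\,e_X - (\sin\psi)\,e_Y. \tag{e$_1$} \]

The Z component of e_z is cos θ, while its component in the (X, Y)-plane, with
length sin θ, can be decomposed with respect to the X and Y axes to get

\[
\begin{aligned}
e_z &= (\cos\theta)\,e_Z + \sin\theta\,\bigl[(\sin\psi)\,e_X + (\cos\psi)\,e_Y\bigr] \\
    &= (\cos\theta)\,e_Z + (\sin\theta\sin\psi)\,e_X + (\sin\theta\cos\psi)\,e_Y
\end{aligned}
\tag{e$_2$}
\]

or equivalently

\[
\begin{aligned}
\langle e_X, e_z\rangle &= \sin\theta\sin\psi \\
\langle e_Y, e_z\rangle &= \sin\theta\cos\psi \\
\langle e_Z, e_z\rangle &= \cos\theta.
\end{aligned}
\tag{e$_2'$}
\]

Substituting (e₁) and (e₂) into ω = φ′e_z + θ′e_N + ψ′e_Z, we get finally

\[
\omega = (\phi'\sin\theta\sin\psi + \theta'\cos\psi)\,e_X
       + (\phi'\sin\theta\cos\psi - \theta'\sin\psi)\,e_Y
       + (\phi'\cos\theta + \psi')\,e_Z,
\]

so that

\[
\begin{aligned}
\omega_1 &= \phi'\sin\theta\sin\psi + \theta'\cos\psi \\
\omega_2 &= \phi'\sin\theta\cos\psi - \theta'\sin\psi \\
\omega_3 &= \phi'\cos\theta + \psi'.
\end{aligned}
\tag{$\omega$}
\]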


These equations are sometimes called “Euler’s geometric equations”, in contrast
to “Euler’s dynamic equations” (E) on page 334. 

In the next section we will apply these equations to analyze the motion of a 
rigid body with one point fixed, but since they are purely geometrical, depend- 
ing only on the fixed directions for the x-y-z axes, we can also apply them when 
the x-y-z coordinate system is moving parallel to itself through some point around which
we wish to consider our rotation, a situation that we will sometimes encounter. 
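
The geometric equations lend themselves to a quick numerical sanity check. The following is only a sketch under stated assumptions: the helper names (`matmul`, `B`, `omega_from_matrix`, `omega_from_formula`) are made up for illustration, the derivative B′ is approximated by a central difference, and the choice of which off-diagonal entries of B′B⁻¹ are read as ω₁, ω₂, ω₃ is one way of fixing the "up to sign" in the Lemma.

```python
import math

def matmul(P, Q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(P):
    return [[P[j][i] for j in range(3)] for i in range(3)]

def Z(angle):
    """Matrix of the form used for Z_phi and Z_psi."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def X(angle):
    """Matrix of the form used for X_theta."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def B(phi, theta, psi):
    """B = Z_psi X_theta Z_phi."""
    return matmul(Z(psi), matmul(X(theta), Z(phi)))

def omega_from_matrix(angles, rates, h=1e-6):
    """Body components of omega read off from B'B^{-1} (B^{-1} = B^T)."""
    forward = B(*[q + h * r for q, r in zip(angles, rates)])
    backward = B(*[q - h * r for q, r in zip(angles, rates)])
    dB = [[(forward[i][j] - backward[i][j]) / (2 * h) for j in range(3)]
          for i in range(3)]
    S = matmul(dB, transpose(B(*angles)))  # skew-symmetric up to rounding
    # Assumed sign convention: these entries give (omega_1, omega_2, omega_3).
    return (S[1][2], S[2][0], S[0][1])

def omega_from_formula(angles, rates):
    """Euler's geometric equations."""
    phi, theta, psi = angles
    dphi, dtheta, dpsi = rates
    return (dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi),
            dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi),
            dphi * math.cos(theta) + dpsi)
```

Calling both functions with the same angles (φ, θ, ψ) and rates (φ′, θ′, ψ′) should give triples that agree to within the finite-difference error.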




The heavy symmetrical top. We are now ready to consider a symmetrical top,
with one point fixed, acted upon by gravity. We will draw the top as a symmetric
body, although we really only require that its inertia ellipsoid is symmetrical,
and that its center of gravity lies on the rotation axis. The picture we usually
have in mind is the (toy) top on a non-slippery surface (a), perhaps with grooves
around which to wind a string, which can be pulled off rapidly to impart a rapid
rotation. Our theoretical top might best be realized as a body having its fixed
point attached to a swiveling joint (b). Textbooks often picture the top as in (c),
where a heavy wheel spins on an axis attached to a joint supported on a heavy
base. The position of our top is then described by a rotation about the fixed
point (to determine it uniquely, we should imagine that some point not on the
axis has been marked).

[Figure: (a) Toy Top; (b) Swivel Top; (c) Lab Top.]

If the mass of the top is M, the total effect of gravity is equivalent to a single
force of magnitude Mg acting on the center of mass, which we’re assuming is
on the rotation axis, at some distance l from the fixed point.¹





[Figure: the top, with the line of nodes marked.]


¹ Of course, there is also an upward force on the fixed point of the top from the surface
on which it spins, and horizontal frictional forces of that surface, all of which keep the 
point fixed; here (and previously in this Chapter) we are implicitly using the analysis of 
Chapter 6, with our configuration space being SO(3). 




Substituting the first two formulas of (ω) into the formula for the kinetic
energy,

\[ T = \tfrac12 I_1(\omega_1^2 + \omega_2^2) + \tfrac12 I_3\omega_3^2, \]

gives

\[ T = \tfrac12 I_1(\theta'^2 + \phi'^2\sin^2\theta) + \tfrac12 I_3\omega_3^2, \]

while the potential energy is simply V = Mgl cos θ, so conservation of energy
gives

\[ E = \tfrac12 I_1(\theta'^2 + \phi'^2\sin^2\theta) + \tfrac12 I_3\omega_3^2 + Mgl\cos\theta \]

for a constant E. We have not yet applied the third formula of (ω), for the
following reason. Note that the torque of the downward force of gravity is
along the line of nodes, while the Z-axis is perpendicular to this line, which
means that ⟨L, e_Z⟩ = I₃ω₃ is a constant (hence the top is always spinning
with constant angular velocity around its symmetry axis), and we can write our
equation in terms of the new constant Ẽ = E − ½I₃ω₃² as

\[ \tilde E = \tfrac12 I_1(\theta'^2 + \phi'^2\sin^2\theta) + Mgl\cos\theta \tag{A} \]

[compare to equation (b′) on page 290].
When we do apply the third equation of (ω), we find that

\[ I_3\omega_3 = I_3(\phi'\cos\theta + \psi') = I_1 a \tag{B} \]

for a suitable constant a.

Note, moreover, that since the z-axis is also perpendicular to the line of nodes,
the component ⟨L, e_z⟩ = ⟨I₁ω₁e_X + I₁ω₂e_Y + I₃ω₃e_Z, e_z⟩ is also constant.
Using the values of ω_i from (ω), together with (e₂′), we find that

\[ (I_1\sin^2\theta + I_3\cos^2\theta)\,\phi' + I_3\psi'\cos\theta = I_1 b \tag{C} \]

for a suitable constant b.

Equations (B) and (C) are often not treated in this direct way, but are instead 
deduced from considerations about the Lagrangian (see Chapter 12), which 
provide the equations “automatically”, without even worrying about what they 
signify. In any case, however, (A), (B), (C) now provide everything we need.




We use (B) to write

\[ I_3\psi' = I_1 a - I_3\phi'\cos\theta, \tag{1} \]

and substitute into (C) to obtain

\[ \phi' = \frac{b - a\cos\theta}{\sin^2\theta} \tag{2} \]

[compare to equation (a′) on page 290]. Thus φ is known once θ is determined;
moreover, by substituting (2) back into (1) we obtain

\[ \psi' = \frac{I_1 a}{I_3} - \cos\theta\cdot\frac{b - a\cos\theta}{\sin^2\theta}, \tag{2$'$} \]

which shows that ψ is also known once θ is determined.
To obtain an equation for θ, we substitute (2) into (A), ending up with

\[ \sin^2\theta\cdot\theta'^2 = \sin^2\theta\,(\alpha - \beta\cos\theta) - (b - a\cos\theta)^2 \tag{3} \]

where the constants α and β are defined as

\[ \alpha = \frac{2\tilde E}{I_1}, \qquad \beta = \frac{2Mgl}{I_1}. \]

Finally, if we set u = cos θ, then (3) reduces to

\[ u'^2 = f(u) = (1 - u^2)(\alpha - \beta u) - (b - au)^2 \tag{$*$} \]

for a cubic polynomial f(u) [and parts of the succeeding discussion will call to
mind the discussion of the spherical pendulum].
At this point we could write (*) as

\[ t = \int \frac{du}{\sqrt{(1 - u^2)(\alpha - \beta u) - (b - au)^2}} \]

so that u could be written in terms of Jacobian elliptic functions, leading to a
formula for θ, and thence by (2) and (2′) to formulas for φ and ψ, theoretically
solving the problem. But we can get a much better qualitative picture by
considering the general properties of the cubic polynomial f(u). Naturally, only
the behavior of f for u ∈ [−1, 1] is significant, since u = cos θ. Note that the
coefficient of u³ in f(u) is β, which is positive.

We assume for now that 1 and −1 are not roots of f, implying that a ≠ ±b,
leaving the contrary case to be considered later in the game. Since f(±1) =
−(b ∓ a)², it follows that f is definitely negative at 1 and −1. But the f that
arises for a top can’t be negative on all of [−1, 1], since for any value of u
occurring in (*) we must have f(u) ≥ 0. Generally we will have f(u) > 0 at
some u ∈ (−1, 1), and then the graph of f looks something like (a), with f
having two zeros u₁ < u₂ ∈ [−1, 1] (as well as a zero u₃ > 1 of no interest to us),
so that u lies in the interval [u₁, u₂]. There is also the special situation where
there is just one (double) zero, u₁ = u₂ in [−1, 1], as in (b), so that we always
have u = u₁.

[Figure: graphs of f on [−1, 1]: (a) two zeros u₁ < u₂ in [−1, 1] and a third zero u₃ > 1; (b) a double zero u₁ = u₂.]
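
To make the role of the cubic concrete, here is a minimal sketch that locates the zeros of f in [−1, 1] by scanning for sign changes and refining each one by bisection. The function names and the sample constants α = 2, β = 2, a = 3, b = 2 are illustrative assumptions, not values taken from the text; a double zero as in case (b) produces no sign change and would not be detected by this simple scan.

```python
def f(u, alpha, beta, a, b):
    """The cubic f(u) = (1 - u^2)(alpha - beta u) - (b - a u)^2."""
    return (1 - u * u) * (alpha - beta * u) - (b - a * u) ** 2

def bisect(g, lo, hi, tol=1e-12):
    """Refine a sign change of g on [lo, hi] to a root."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid < 0) == (g_lo < 0):
            lo, g_lo = mid, g_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

def zeros_in_unit_interval(alpha, beta, a, b, samples=2000):
    """Simple zeros of f in [-1, 1], in increasing order."""
    g = lambda u: f(u, alpha, beta, a, b)
    grid = [-1 + 2 * i / samples for i in range(samples + 1)]
    roots = []
    for left, right in zip(grid, grid[1:]):
        if (g(left) < 0) != (g(right) < 0):
            roots.append(bisect(g, left, right))
    return roots

# Illustrative constants (assumed): f is negative at -1 and 1, positive at 0.5.
u1, u2 = zeros_in_unit_interval(alpha=2.0, beta=2.0, a=3.0, b=2.0)
```

For these sample constants u then stays in [u₁, u₂], so θ stays between arccos u₂ and arccos u₁.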
As in the case of the spherical pendulum on page 291, differentiation of
equation (*) leads to a second order equation

\[ 2u'' = f'(u) \quad\text{or, in Leibnizian notation,}\quad 2\,\frac{d^2u}{dt^2} = f'(u), \tag{$*'$} \]

of the form encountered in Chapter 4 (page 128), so if cos θ is not constant,
then it varies periodically between u₁ and u₂, the two places where u′ = 0, and
naturally θ varies similarly between θ₁ = arccos(u₁) and θ₂ = arccos(u₂), the
two places where θ′ = 0, with the top rising and falling in a periodic fashion.
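
The periodic rise and fall can be observed numerically by integrating the second order equation 2u″ = f′(u). The sketch below uses a hand-written classical Runge-Kutta step; the function names, step size, and the sample constants α = 2, β = 2, a = 3, b = 2 are assumptions made for illustration only. It tracks the extreme values of u and the largest deviation of u′² − f(u) from zero, which should stay negligible if the first order equation holds along the solution.

```python
import math

def f(u, alpha, beta, a, b):
    return (1 - u * u) * (alpha - beta * u) - (b - a * u) ** 2

def f_prime(u, alpha, beta, a, b):
    """Derivative of the cubic f with respect to u."""
    return -2 * u * (alpha - beta * u) - beta * (1 - u * u) + 2 * a * (b - a * u)

def integrate_u(u0, v0, alpha, beta, a, b, dt=1e-3, steps=20000):
    """RK4 integration of u' = v, v' = f'(u)/2; returns (min u, max u, drift)."""
    accel = lambda u: 0.5 * f_prime(u, alpha, beta, a, b)
    u, v = u0, v0
    u_min = u_max = u0
    drift = 0.0
    for _ in range(steps):
        k1u, k1v = v, accel(u)
        k2u, k2v = v + 0.5 * dt * k1v, accel(u + 0.5 * dt * k1u)
        k3u, k3v = v + 0.5 * dt * k2v, accel(u + 0.5 * dt * k2u)
        k4u, k4v = v + dt * k3v, accel(u + dt * k3u)
        u += dt * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        u_min, u_max = min(u_min, u), max(u_max, u)
        drift = max(drift, abs(v * v - f(u, alpha, beta, a, b)))
    return u_min, u_max, drift

# Start at u = 0.5 with u' chosen so that u'^2 = f(u) holds initially.
params = dict(alpha=2.0, beta=2.0, a=3.0, b=2.0)
u_min, u_max, drift = integrate_u(0.5, math.sqrt(f(0.5, **params)), **params)
```

With these constants the computed extremes of u should sit at the two zeros of f in [−1, 1], which is the numerical counterpart of u varying periodically between u₁ and u₂.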
As for φ, since we have

\[ \phi' = \frac{b - au}{1 - u^2} \quad\text{with } 1 - u^2 > 0, \]

we immediately see that if b/a ∉ [u₁, u₂], then φ′ is never 0, so φ varies
monotonically, and the axis of the top traces out a curve on the sphere like that
shown in (a). On the other hand, if b/a ∈ (u₁, u₂), then φ′ changes sign each
time u passes through the value b/a, so φ alternately advances and retreats as
θ varies between θ₁ and θ₂, and we obtain a curve with loops, as in (b).

[Figure: curves traced on the sphere by the axis of the top: (a) φ monotonic; (b) a curve with loops.]




Note that, although we drew the graph of f with u₁ > 0 and u₂ > 0, either
or both could be negative, meaning that one or both of θ₁ and θ₂ can be greater
than π/2, with the axis of the top pointing below the horizontal. Naturally, we
would need a Swivel Top rather than a Toy Top to realize this situation.

Generally speaking, the motion of the top is determined by three periodic
motions: (1) its fixed rate of rotation around its axis; (2) the change of θ, its
nutation; and (3) the change of φ, its precession. The three periods will usually
be distinct, and the top returns to its initial position only in the exceptional case
where they are all commensurable.


The cuspidal case; fast tops. Our general discussion has left out some important
special cases, among them the case of a double root of f, which we will defer
to the next section. For now, we note that although we’ve seen that the axis of
the top traverses a looping curve for b/a ∈ (u₁, u₂), we still have to examine
the possibility that b/a is an endpoint u_i of [u₁, u₂], so that at some time we
have θ = θ_i and θ′ = φ′ = 0. At such a time, in equation (A) we have
Ẽ = Mgl cos θ_i, while the other terms are never negative. Consequently, for
Ẽ to remain constant, Mgl cos θ must begin to decrease, so θ must begin to
increase, which means that we must have been at the smaller of the two angles,
which in this section we call θ₁, rather than at θ₂. Problem 5
shows that the curve traced out by the axis of the top is perpendicular to the
circle θ = θ₁ at the cuspidal intersection points.

[Figure: the cuspidal curve traced by the axis of the top between the circles θ = θ₁ and θ = θ₂ about the z-axis.]

This seemingly exceptional situation occurs whenever we have a top held
spinning with its axis stationary at inclination θ₁, and then simply release it at
time t = 0, without imparting any other motion to it, so that θ′(0) = φ′(0) = 0.
We could start the Lab Top spinning while holding the end of its axle, or, if our
Toy Top was designed to spin independently around the protruding axle (some
toy tops are made this way, designed so that the top can be set spinning by
pushing down on the axle), we could set it spinning and then carefully place it
on the surface at a given angle.

As our analysis shows, the top always starts falling (with θ increasing all the
while), until it gets to the angle θ₂, at which point it starts rising until it gets
back to the angle θ₁. But we can say quite a bit more.




Since φ′(0) = 0, equation (2) for φ′ gives

\[ b = a\cos\theta_1 = au_1, \text{ say,} \]

and since we have Ẽ(0) = Mgl cos θ₁, our definition of the constants α and β
in equation (3) gives

\[ \alpha = \beta\cos\theta_1 = \beta u_1. \]

We then have

\[
\begin{aligned}
f(u) &= (1 - u^2)(\alpha - \beta u) - (b - au)^2 \\
     &= (1 - u^2)\,\beta(u_1 - u) - a^2(u_1 - u)^2,
\end{aligned}
\]

which can be written as

\[ f(u) = (u_1 - u)\bigl[\beta(1 - u^2) - a^2(u_1 - u)\bigr], \tag{$*$cusp} \]

so that, in addition to the zero u₁ of f, the other zeros are the solutions of the
quadratic equation

\[ 0 = \beta u^2 - a^2 u + (a^2 u_1 - \beta). \]


If we set

\[ \lambda = \frac{a^2}{2\beta}, \]

the solutions can be written as λ ± √(λ² − 2λu₁ + 1). Since one of these must
be the irrelevant solution that is greater than 1, the solution of interest is the
one with a negative square root, so

\[
\cos\theta_2 = \lambda - \sqrt{\lambda^2 - 2\lambda u_1 + 1}
             = \lambda - \lambda\Bigl(1 - \frac{2\cos\theta_1}{\lambda} + \frac{1}{\lambda^2}\Bigr)^{1/2}.
\]

To evaluate λ explicitly, note that from the definition β = 2Mgl/I₁ and the
value of a given by equation (B) we have

\[ \lambda = \frac{(I_3\omega_3)^2}{4I_1Mgl}. \]





Now consider what happens for large λ, which simply means that ω₃ is large,
i.e., the top is spinning rapidly. We can use the binomial theorem to write

\[
\cos\theta_2 = \lambda - \lambda\Bigl(1 + \frac12\Bigl[-\frac{2\cos\theta_1}{\lambda} + \frac{1}{\lambda^2}\Bigr]
 - \frac18\Bigl[-\frac{2\cos\theta_1}{\lambda} + \frac{1}{\lambda^2}\Bigr]^2 + \cdots\Bigr),
\]

so

\[ \cos\theta_2 \approx \cos\theta_1 - \frac{\sin^2\theta_1}{2\lambda}. \]

In particular, θ₂ − θ₁ will be small, so that we also have

\[ \cos\theta_2 - \cos\theta_1 \approx (\theta_2 - \theta_1)(-\sin\theta_1), \]

and therefore

\[ \theta_2 - \theta_1 \approx \frac{\sin\theta_1}{2\lambda} = \frac{2I_1Mgl}{(I_3\omega_3)^2}\,\sin\theta_1. \]

Thus, the faster the top is spinning, the smaller the nutation.

Moreover, since the nutation is small, we can approximate the equation u′² =
f(u) by replacing the term (1 − u²) in (*cusp) by (1 − u₁²) = sin² θ₁. If we write
the resulting equation u′² = f(u) in terms of the new variable x = u₁ − u, with
x′² = u′² = f(u), then our equation becomes

\[ x'^2 = x(\beta\sin^2\theta_1 - a^2x), \]

and the solution of this differential equation for the initial condition x(0) = 0,
equivalent to u(0) = u₁, is

\[ x(t) = \frac{\beta\sin^2\theta_1}{2a^2}\,(1 - \cos at). \]

Since this is a constant times (1 − cos at), the frequency of nutation is
approximately

\[ a = \frac{I_3\omega_3}{I_1}. \]

Thus, the faster the top is spinning, the greater the frequency of nutation.

Finally, since

\[ \phi' = \frac{a(u_1 - u)}{\sin^2\theta} \approx \frac{ax}{\sin^2\theta_1}, \]

our formula for x(t) gives

\[ \phi' \approx \frac{\beta}{2a}\,(1 - \cos at), \]

which has an average value of

\[ \frac{\beta}{2a} = \frac{Mgl}{I_3\omega_3}. \]

Thus, the faster the top is spinning the smaller the rate of precession.

To put all this another way, as a top slows down because of friction it will start 
precessing faster, and with a nutation of larger magnitude, though of smaller 
frequency. 
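
The fast-top estimates can be compared with the exact root of the quadratic in a few lines. This is a rough sketch: the function name, the dictionary keys, and the sample physical values in the usage line are hypothetical choices for illustration, and the approximations are only meaningful when λ is large.

```python
import math

def fast_top(I1, I3, M, g, l, omega3, theta1):
    """Exact lower turning point and fast-top approximations for a top
    released from rest (apart from its spin) at inclination theta1."""
    lam = (I3 * omega3) ** 2 / (4 * I1 * M * g * l)
    u1 = math.cos(theta1)
    return {
        "lambda": lam,
        # Root of u^2 - 2*lam*u + (2*lam*u1 - 1) = 0 with the minus sign.
        "cos_theta2_exact": lam - math.sqrt(lam * lam - 2 * lam * u1 + 1),
        # Leading terms of the binomial expansion for large lambda.
        "cos_theta2_approx": u1 - math.sin(theta1) ** 2 / (2 * lam),
        "nutation_amplitude": math.sin(theta1) / (2 * lam),   # theta2 - theta1
        "nutation_frequency": I3 * omega3 / I1,               # the constant a
        "mean_precession_rate": M * g * l / (I3 * omega3),    # beta / (2a)
    }

# Hypothetical small top: SI units, spin 300 rad/s, released at 30 degrees.
slow_spin = fast_top(2e-4, 1e-4, 0.1, 9.81, 0.03, 300.0, math.pi / 6)
fast_spin = fast_top(2e-4, 1e-4, 0.1, 9.81, 0.03, 600.0, math.pi / 6)
```

Doubling the spin in this example quarters the nutation amplitude, doubles the nutation frequency, and halves the mean precession rate, in line with the three conclusions above.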




If we start an actual top rapidly spinning at a fixed angle, and then release 
it without pushing it in any direction, the resulting very small nutation may be 
unnoticeable, especially as it is very likely to be damped out by friction because 
of its high frequency. So it can appear that the top simply starts precessing on 
its own, which is actually impossible, since the only force on it is gravity, which 
is perpendicular to the direction of the precession.¹ The point raised at the
end of Chapter 5 would seem to be related to this fact, a matter that will be 
discussed later on, in the section on gyroscopes. 


Precessing tops. There are certain situations where a top will truly exhibit
precession without nutation. In the special case where there is only one zero of f
in [−1, 1], so that u₁ = u₂, we have θ = θ₁, a constant, and then φ′ is also
a constant φ′₁ (and ψ′ is similarly a constant ψ′₁), so that we just get a curve
circling at constant angular velocity around the parallel at θ₁.

[Figure: the axis of the top circling the z-axis along the parallel at θ₁.]


Experimentally, we can obtain this situation by starting the Toy Top or Lab
Top spinning, as in the previous section, and then giving the top a horizontal
shove to initiate precession. To show mathematically that there are initial
conditions that will lead to this case, we note that since u₁ is a double root, we have
f′(u₁) = 0, and equation (*′) gives u″(u₁) = 0, implying that θ″(θ₁) = 0. If
we write equation (3) on page 347 as

\[ \theta'^2 = (\alpha - \beta\cos\theta) - \frac{(b - a\cos\theta)^2}{\sin^2\theta}, \]

differentiate it, divide by θ′, and then use the fact that θ″(θ₁) = 0, together
with equation (2) on page 347, we find that

\[ \frac{\beta}{2} = a\phi_1' - (\phi_1')^2\cos\theta_1. \]

Using the definitions of a and β, we then get as the condition for precession

\[ Mgl = \phi_1'\bigl(I_3\psi_1' - (I_1 - I_3)\,\phi_1'\cos\theta_1\bigr), \]


¹ Chapter 11 gives other examples of situations where the unobserved effects of friction
serve to produce paradoxical results. 




which is a quadratic equation for the constant φ′₁:

\[ (I_1 - I_3)\cos\theta_1\,(\phi_1')^2 - (I_3\psi_1')\,\phi_1' + Mgl = 0. \tag{P} \]

If we want to specify initial conditions at t = 0,

\[ (\theta, \phi, \psi;\ \theta', \phi', \psi') = (\theta_1, \phi(0), \psi(0);\ 0, \phi_1', \psi_1'), \]

then φ(0) and ψ(0) can be chosen arbitrarily, and after choosing a value for ψ′₁,
we can find φ′₁ satisfying (P), as long as we choose ψ′₁ making the discriminant
of (P) non-negative:

\[ (I_3\psi_1')^2 \ge 4Mgl\,(I_1 - I_3)\cos\theta_1. \]

When the discriminant satisfies this condition with the strict inequality holding,
there will be two different solutions for φ′₁, the “slow” and “fast” precessions.
Our equation (P) can never be satisfied for the initial condition φ′₁ = 0: a true
precession is possible only when we start the top with an initial precessional
velocity.
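
As an illustration of (P), the following sketch solves the quadratic for the two precession rates. It assumes (I₁ − I₃) cos θ₁ > 0 and I₃ψ′₁ > 0, so that both roots are positive and the labels "slow" and "fast" apply; the function name and the sample values are hypothetical. The smaller root is computed as 2Mgl/(p + r) rather than by subtracting nearly equal numbers.

```python
import math

def precession_rates(I1, I3, M, g, l, theta1, dpsi):
    """Roots (slow, fast) of (I1 - I3) cos(theta1) x^2 - I3 dpsi x + M g l = 0.

    Assumes (I1 - I3) * cos(theta1) > 0 and I3 * dpsi > 0.
    Returns None when the discriminant is negative (no steady precession).
    """
    A = (I1 - I3) * math.cos(theta1)
    p = I3 * dpsi
    C = M * g * l
    disc = p * p - 4 * A * C
    if disc < 0:
        return None
    r = math.sqrt(disc)
    slow = 2 * C / (p + r)      # equals (p - r) / (2A), but numerically stabler
    fast = (p + r) / (2 * A)
    return slow, fast

# Hypothetical top in SI units, inclined at 30 degrees, psi' = 300 rad/s.
rates = precession_rates(2e-4, 1e-4, 0.1, 9.81, 0.03, math.pi / 6, 300.0)
```

For a rapidly spinning top the slow root is close to Mgl/(I₃ψ′₁), and both roots are positive, so neither can be zero, consistent with the remark that (P) has no solution with φ′₁ = 0.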

Notice that since the cubic f for a top always has at least one positive root, 
in the case of a precessing top any slight modification of f will have to have two 





nearby roots, so the precessing top is stable, in the sense that slight perturbations, 
like friction or a slight breeze, won't cause the top to suddenly start nutating 
through a large angle, in contrast to a situation that can arise in the next case 
that we will examine. 


Sleeping tops. The case where 1 or −1 is a root of f(u) arises for b = ±a;
we’ll concentrate on the case b = a, since b = −a is basically the same situation
with the top upside down. The simplest example occurs for a “sleeping” top,
spinning with its axis vertical. Since we always have θ = 0 we also have θ′ = 0,
and then equation (A) gives Ẽ = Mgl cos θ = Mgl, and it follows that we also have
α = β.

As with the case of the precessing top, we want to investigate whether the
motion of a sleeping top is stable. Although the Euler angles aren’t well-defined
for a vertical axis, we can still consider the cubic occurring in (*),

\[
\begin{aligned}
f(u) &= (1 - u^2)(\alpha - \beta u) - (b - au)^2 \\
     &= (1 - u^2)\,\beta(1 - u) - a^2(1 - u)^2 \\
     &= (1 - u)^2\bigl[\beta(1 + u) - a^2\bigr],
\end{aligned}
\]




which now has 1 as a double root, like either (a) or (b) below. Under a slight
perturbation, the cubic f will again change only slightly, but in case (a) there
will now be two roots both close to 1, so that the axis remains close to vertical,
while in case (b) one of the roots will be close to the root u₁, and the top will
suddenly start nutating through an angle θ close to arccos u₁.

[Figure: graphs of f with a double root at 1: (a) f negative on both sides of 1; (b) f positive on both sides of 1, with a further root u₁ < 1.]

We will definitely have case (a) if f″(1) < 0 and case (b) if f″(1) > 0. We
find that f″(1) < 0 is equivalent to 2β < a², or

\[ \omega_3^2 > \frac{4MglI_1}{I_3^2}, \]

so we have stability if the top is spinning rapidly enough, with ω₃ satisfying this
inequality, while the motion will definitely be unstable if ω₃ satisfies the reverse
inequality. For the boundary case ω₃² = 4MglI₁/I₃², we find that 1 is a triple
root of f, which again implies stability.
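
The stability criterion translates directly into a threshold spin rate. A minimal sketch, with hypothetical function names and sample values, using f″(1) = 2(2β − a²), which follows from the factored form f(u) = (1 − u)²[β(1 + u) − a²]:

```python
import math

def critical_spin(I1, I3, M, g, l):
    """Spin rate omega_3 at which f''(1) = 0, i.e. omega_3^2 = 4 M g l I1 / I3^2."""
    return 2 * math.sqrt(M * g * l * I1) / I3

def f_second_derivative_at_1(I1, I3, M, g, l, omega3):
    """f''(1) = 2 (2 beta - a^2) for the sleeping-top cubic (b = a, alpha = beta)."""
    a = I3 * omega3 / I1
    beta = 2 * M * g * l / I1
    return 2 * (2 * beta - a * a)

def sleeping_top_is_stable(I1, I3, M, g, l, omega3):
    """True when omega_3 is at or above the critical spin rate."""
    return abs(omega3) >= critical_spin(I1, I3, M, g, l)

# Hypothetical top in SI units.
omega_c = critical_spin(2e-4, 1e-4, 0.1, 9.81, 0.03)
```

For the sample values the threshold comes out near 48.5 rad/s; above it f″(1) is negative, as in case (a), and below it f″(1) is positive, as in case (b).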


The rising top. A cubic of type (b) for the sleeping top can also arise for a
non-sleeping top whose initial conditions happen to give a = b and α = β, and
then the angle θ can vary between arccos u₁ and 0. However, θ will only
approach 0 asymptotically as t → ∞, since the solution of the differential equation
u′² = f(u) with the initial condition u(t₀) = 1 is simply the constant function
u(t) = 1. On the other hand, since

\[ \lim_{\theta\to 0}\phi' = \lim_{\theta\to 0}\frac{a(1 - \cos\theta)}{\sin^2\theta} = \frac{a}{2}, \]

the angle φ grows infinitely large as t → ∞, so the axis winds around the top
pole of the sphere infinitely often. Of course, aside from the fact that the top
won’t continue spinning forever, once the axis of the top gets close enough to the
vertical, friction will make it indistinguishable from a sleeping top, so this motion
will essentially just look like the time reversal of a slowly spinning sleeping top
that dips down because of instability.

[Figure: the axis of the top spiraling in toward the top pole about the z-axis.]


The polar cuspidal top. The final theoretical motion occurs when we consider
the remaining case, where a = b, α ≠ β, with f′(1) = −2(α − β) ≠ 0, so
that 1 is not a double root. The three different possibilities are shown below.





The first two graphs are the same as those appearing on page 348, except
that the “extraneous” root u₃ is now 1, which doesn’t change anything in the
analysis. The third graph is also like the first graph on that page, but since our
interval is now [u₁, 1] with b/a = 1 ∈ [u₁, 1], this behaves like the cuspidal case,
except that the top circle now degenerates into the top pole.





Gyroscopes. The basic features of souped-up laboratory or commercial gyro- 
scopes are all illustrated by the toy gyroscope (a), where the axis of the wheel 
rotates in an inner “gimbal” A, which can rotate in an outer gimbal B, as shown 
in the second picture, where the dashed line is the imaginary axis connecting 
the two points where the gimbals intersect. The gimbal B, in turn, can rotate
about the vertical axis, providing 3 degrees of freedom. For the sake of stability, 
an arrangement like (b) might be preferred. 







This is customarily called a Cardan suspension, after Jerome Cardan (Girolamo 
Cardano, 1501-1576), famous for his book Ars Magna (1545), which presented the 
solutions to the cubic and quartic equations. However, it seems¹ that Cardan
merely described, but never claimed to have invented, gimbals, which appear
to have been around at least since 140 B.C. in China. They can be used in such
mundane applications as cup holders in boats and moving vehicles, and were
used in the 19th century to hold lamps in Roma caravans. The first gyroscope
seems to have been constructed by the mathematician and astronomer Johann
Bohnenberger in 1817; it was given its name, from the Greek γῦρος = turn and
σκοπός = view, by Foucault, whom we will meet again in Chapter 10.

The fixed point of a gyroscope is its center of mass, so the Mgl term that we
used in the analysis of the top is irrelevant here; the net effect of the gimbals
and stand is to have no force on the center of mass, so that gravity is irrelevant, 
and we can just as well imagine the gyroscope at any angle, providing a picture 
closer to our previous picture of the top. A situation equivalent to the heavy top 
occurs when we exert a force on the inner gimbal. A steady downward force 





on gimbal A, indicated by the black arrow in the figure, produces a constant
torque on the axle, which takes the place of the Mgl term in our analysis of
the top, so the exact same analysis will apply (the downward force indicated by
the white arrow would cause a torque corresponding to a −Mgl term, so we
would have to turn our second picture upside down).

Since we usually start the gyroscope spinning, and then exert the force, this 
corresponds to the cuspidal case. When the gyroscope is attached to its stand, 
the precession involves the outer gimbal rotating, while the small nutation of 
the axis involves the inner gimbal oscillating. For a heavy, rapidly spinning
gyroscope, this nutation is quickly damped, and might even not be perceived, so
it can seem that a push in one direction simply causes motion in a perpendicular
direction. The precession is sometimes demonstrated with apparatus like that
in the figure below, where the balancing counterweight can be moved toward 


¹ See findarticles.com/p/articles/mi_m1310/is_1988__Oct/ai_6955856 for
free access to an article in the UNESCO Courier. 




the tall central axis to produce precession in one direction, or further away to 
produce precession in the other direction. 





The original gyroscope on its stand provides a good model for the experiment
described at the end of Chapter 5. The back half of the inner gimbal A
represents the person’s arms, which can be rotated, principally by the shoulder
muscles, within the back half of gimbal B, representing the rest of the person,
with the stand representing the rotating bench. Although the surprising
precession is the main point of such a demonstration, in most cases the nutation
is also noticeable, though it is probably usually perceived simply as the person
struggling to rotate the axis.


As we would suspect, when we push on the outer gimbal, the axis of the 
spinning wheel moves away from the horizontal, and the direction in which 
it moves, which we can figure out by the considerations of Chapter 5, can be 





stated as a simple rule: Given the direction of rotation of the outer gimbal and
the direction of rotation of the wheel (a), the axis of the spinning wheel moves
so that the directions of spin tend to align (b), with the two directions being
identical when the axis is vertical (c). This situation does not correspond
directly to the heavy top because a constant push on the outer gimbal does
not produce a constant torque. What one observes when pushing on the outer
gimbal is that the resistance tends to diminish, and the axis suddenly flips up
to the vertical direction. Continuing to push on the outer gimbal then has no
effect, but if we push the gimbal in the other direction, the gyroscope suddenly
flips over and starts spinning in the opposite direction!




Of course, neither of these observed phenomena can occur exactly as just 
described. When the axis first reaches the vertical direction, it can’t suddenly 
stop there. It must oscillate about the vertical in some way, though this oscilla- 
tion is quickly damped by friction. Once the axis has stabilized in the vertical 
direction, continuing to push on the outer gimbal should theoretically have no 
effect at all. In practice, the axis will always wobble a bit, with the rotation 
of the outer gimbal simply helping to bring it back to vertical more quickly. 
However, if we now push the outer gimbal in the other direction, then as soon 
as the axis deviates the slightest bit from the vertical, the axis will have to move 
in the other direction, in order for the directions of spin to align, so the whole 
process is reversed, giving the impression that the gyroscope suddenly flips over. 

The device in (a) of the figure below is substantially the same gyroscope, with 
the torque produced by rotating the base. Alternatively, as in (b) we can hold 




our arms outstretched in front of us, with one hand holding each end of the 
base, and then rotate ourselves, again causing the axle of the wheel to flip up 
to the vertical direction. This is actually a somewhat misleading description of
the axle’s behavior, since the direction in which it flips has nothing to do with 
“up” or “down”; what we should say is that it causes the axle to line up with 
the axis of the rotation, namely, our body. 

Finally, part (c) of the figure shows our device positioned on the equator of
the earth in such a way that the earth’s rotation on its axis takes the place of
the rotating person. The base is now in the plane of the equator, so that from
the point of view of an actual person standing at the equator, the base and the
initial position of the axis are vertical (the base is presumably anchored to a
wall of some sort, which takes the place of our outstretched arms). The axle of
the wheel will move in a horizontal plane, lining up with the axis of the earth’s
rotation, and thus point to the geographic north pole, or at any rate oscillate
about this direction, so our device ought to serve as a “gyrocompass”. We just
have to see whether this will be practical, and to take into account what happens
at other latitudes.







The gyrocompass. We will begin by considering our original gyroscope. The
mathematical analysis will go right back to the Euler equations, in fact right
back to our original equation

\[ \tau = L' + \omega\times L. \tag{E$_\times$} \]

Our rotating coordinate system (e₁, e₂, e₃) will have e₁ pointing along the axis
connecting the points where the gimbals intersect, which always lies in a
horizontal plane, e₂ a perpendicular vector lying along the wheel, and e₃ pointing
in the direction of the axis of our wheel. As usual, I₃ will be the moment of
inertia around this axis, while I₁ = I₂ will be the moments of inertia about a
diameter of the wheel. The only force on the wheel is the torsion caused by
rotating the outer gimbal.

If a is the constant angular velocity of the outer gimbal, and θ is the angle
from the vertical to e₂, then ω, the angular velocity of our coordinate system
in the body coordinates, is

\[
\begin{aligned}
\omega &= \{(a\cos\theta)\,e_3 + (-a\sin\theta)\,e_1\} + \theta' e_2 \\
       &= (-a\sin\theta)\,e_1 + \theta' e_2 + (a\cos\theta)\,e_3.
\end{aligned}
\]

If A is the angular velocity of the wheel about its axis (the large initial angular
velocity we give it, plus the a cos θ term, plus any small changes that the
combined forces may produce), then the angular velocity of the wheel is

\[ (-a\sin\theta)\,e_1 + \theta' e_2 + A e_3, \]

and its angular momentum in body coordinates is

\[ L = (-I_1 a\sin\theta)\,e_1 + I_1\theta' e_2 + I_3 A e_3. \]




We can write, for some X and Y,

\[
\begin{aligned}
L' &= X e_1 + (I_1\theta'')\,e_2 + (I_3 A')\,e_3 \\
\omega\times L &= Y e_1 + (I_3 aA\sin\theta - I_1 a^2\sin\theta\cos\theta)\,e_2 + 0,
\end{aligned}
\]

and τ is a multiple of e₁, so by looking at the coefficient of e₂ in (E×) we obtain

\[ I_1\theta'' + I_3 aA\sin\theta - I_1 a^2\sin\theta\cos\theta = 0. \tag{$*$} \]

Looking at the coefficient of e₃, we simply get A′ = 0, so that A is constant.
These conclusions can presumably also be derived from the equations (E) on
page 334.

This analysis can be immediately applied to a gyrocompass at the equator,
where a = 1 revolution per 24 hours, or 1 revolution per 86,400 seconds,
while the angular velocity A of the wheel in a typical gyroscope is about 20,000
revolutions per second, so a ≪ A ⟹ a² ≪ aA. If we therefore ignore the
term with the factor a² in our equation, we simply get

\[ \theta'' + k^2\sin\theta = 0, \qquad k^2 = \frac{I_3 aA}{I_1}, \]

whose small oscillations are approximately multiples of θ(t) = sin kt, oscillating
about θ = 0. For a thin rotating disk, I₁ is about I₃/2 (Problem 5-6), so that k is
approximately √(2aA), giving a period at the equator of approximately
2π/√(2aA) ≈ 9.24 seconds for A = 20,000. One can determine true north by
bisecting the angle of swing, or wait for friction to cause the motion to cease.

When the gyrocompass is at latitude $\lambda$, we have

$$\begin{aligned}
\omega &= (-a\sin\theta\cos\lambda)\,e_1 + (\theta' + a\sin\lambda)\,e_2 + (a\cos\theta\cos\lambda)\,e_3 \\
L &= (-I_1 a\sin\theta\cos\lambda)\,e_1 + I_1(\theta' + a\sin\lambda)\,e_2 + (I_3 A)\,e_3,
\end{aligned}$$

and we end up with the equation

$$I_1\theta'' + I_3 aA\sin\theta\cos\lambda - I_1 a^2\sin\theta\cos\theta\cos^2\lambda = 0,$$

which we approximate by the same equation as before, but with

$$k = \sqrt{\frac{I_3 aA\cos\lambda}{I_1}},$$

so that the period $2\pi/\sqrt{2aA\cos\lambda}$ becomes longer, and the gyrocompass less
useful, as the latitude increases; it is $\approx 11$ seconds for latitude 45°, and infinite
at the north pole.
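
As a rough numerical check of the two periods quoted above, the following Python
sketch evaluates $2\pi/k$ with $k^2 = (I_3/I_1)\,aA\cos\lambda$. The function name and
the default ratio $I_3/I_1 = 2$ (the thin-disk value) are choices made for this
illustration only. The inputs are taken in the units used in the text, revolutions
per second for both $a$ and $A$; if both were converted to radians per second, $k$
would be larger by a factor of $2\pi$ and the periods correspondingly shorter.

```python
import math


def gyrocompass_period(a, A, latitude_deg=0.0, inertia_ratio=2.0):
    """Small-oscillation period 2*pi/k, with k**2 = (I3/I1) * a * A * cos(latitude).

    inertia_ratio is I3/I1; the default 2.0 is the thin-disk approximation.
    """
    k_squared = inertia_ratio * a * A * math.cos(math.radians(latitude_deg))
    return 2.0 * math.pi / math.sqrt(k_squared)


a = 1.0 / 86400.0  # earth: one revolution per 86,400 seconds
A = 20000.0        # wheel: the figure quoted in the text

period_equator = gyrocompass_period(a, A)
period_45 = gyrocompass_period(a, A, latitude_deg=45.0)
print(period_equator, period_45)
```

The two printed values can be compared with the figures of about 9.24 and about 11
seconds given above.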

Of course, all sorts of ingenious and complicated engineering mechanisms, 
and modifications, are required to make a practical gyrocompass, especially one 
that will work not only on land but on a ship, and respond quickly enough to 
allow automatic corrections to keep the ship on course. 


Precession of the equinoxes. Finally, we should add a few words about the 
astronomical “precession of the equinoxes”, a very slow precession of the axis 
of the earth not related to the Euler precession mentioned on page 339. 

If the earth were a perfect sphere, homogeneous, or even radially symmetric, 
the only gravitational effect of the sun on the earth would be a force directed 
toward the sun; in particular, the sun could not produce any torque on the 
earth. Because the earth isn’t exactly spherical, the sun does produce a torque 
on the earth, but this torque is not directly related to the earth’s spinning on its 
axis, except for the fact that this spinning is what produces the bulging near
the equator in the first place. In fact, this torque is due to the “tidal forces” of 
the sun’s gravitational field (Problem 4-20), since a uniform gravitational force 
would produce no torque on the earth no matter what its shape. 

This small torque makes the spinning earth act like a gyroscope with a slow 
precession having a period of about 26,000 years, with constellations appearing 
in the night sky in different seasons during this cycle (the additional orbital 
motion of the earth around the sun is so small compared to the distance to the 
constellations that it is usually completely ignored). At the “spring equinox”, the 
time when the sun is directly overhead at noon, the sun will appear in different
constellations, hence the name “precession of the equinoxes”, known even to 
ancient astronomers/astrologers. In a mere 600 years or so the sun, now in 
Pisces, will be in Aquarius, as anticipated in the musical Hair.

Calculations of the precession become exceedingly involved (Newton presented
a geometric one, with fudging, cf. Cohen-Whitman [1; pg. 265]), and
are seldom even mentioned in textbooks. A description of one of the first serious
attempts, by d’Alembert, can be found in Hand and Finch [1; pp. 317 ff]; an 
exercise in Goldstein [1; Chap. 5] also tackles the problem. 




ADDENDUM 9A 


THE EULER EQUATIONS FOR 
ROTATING PRINCIPAL VECTORS 


THE ROLLING DISC 


When $I_1 = I_2$, it is occasionally useful to choose perpendicular unit principal
vectors $u_1$ and $u_2$ that are not fixed in the body, but that are rotating in the body.
For simplicity, we will actually consider the equations arising for an arbitrary
rotating orthonormal basis $(u_1, u_2, u_3)$ at the fixed point $O$, although in practice
the cases of interest involve only two equal principal moments of inertia, with $u_3$
being fixed.

The rotating orthonormal basis has its own angular velocity $\xi$, and we write $\omega$
and $\xi$ in terms of this rotating coordinate system as

$$\begin{aligned}
\omega &= \omega_1\cdot u_1 + \omega_2\cdot u_2 + \omega_3\cdot u_3 \\
\xi &= \xi_1\cdot u_1 + \xi_2\cdot u_2 + \xi_3\cdot u_3.
\end{aligned}$$


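Similarly, we write the angular momentum vector $L$ of the body as

$$L = \omega_1 I_1\cdot u_1 + \omega_2 I_2\cdot u_2 + \omega_3 I_3\cdot u_3.$$

Our rotating differentiation $'$ now refers to the rotating coordinates $u_i$ and,
writing $dL/dt$ for the ordinary derivative, the equation

$$\tau = \frac{dL}{dt} = L' + \xi\times L$$

then gives us

$$(\mathrm{T})\qquad \begin{aligned}
\tau_1 &= I_1\omega_1' - I_2\omega_2\xi_3 + I_3\omega_3\xi_2 \\
\tau_2 &= I_2\omega_2' - I_3\omega_3\xi_1 + I_1\omega_1\xi_3 \\
\tau_3 &= I_3\omega_3' - I_1\omega_1\xi_2 + I_2\omega_2\xi_1.
\end{aligned}$$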
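
The component equations (T) are just $L' + \xi\times L$ written out in the rotating
basis. The following Python sketch checks this numerically for one set of sample
values; the helper names (`cross`, `torque_from_components`, `torque_from_vectors`)
and the numbers are made up for the illustration.

```python
def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def torque_from_components(I, w, dw, xi):
    """Right-hand sides of (T); w = (w1, w2, w3), dw = their time derivatives."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    xi1, xi2, xi3 = xi
    return (I1 * dw[0] - I2 * w2 * xi3 + I3 * w3 * xi2,
            I2 * dw[1] - I3 * w3 * xi1 + I1 * w1 * xi3,
            I3 * dw[2] - I1 * w1 * xi2 + I2 * w2 * xi1)


def torque_from_vectors(I, w, dw, xi):
    """L' + xi x L, where L has components I_i * w_i in the rotating basis."""
    L = tuple(Ii * wi for Ii, wi in zip(I, w))
    dL = tuple(Ii * dwi for Ii, dwi in zip(I, dw))
    return tuple(d + c for d, c in zip(dL, cross(xi, L)))


# Arbitrary sample values, chosen only to exercise the formulas.
I = (2.0, 2.0, 3.5)
w = (0.3, -1.2, 4.0)
dw = (0.1, 0.7, -0.4)
xi = (0.3, -1.2, 0.5)

lhs = torque_from_components(I, w, dw, xi)
rhs = torque_from_vectors(I, w, dw, xi)
print(all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs)))
```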


We will also consider the more general case where the center of mass $x$ of
our rigid body is moving, and the $\omega_i$ and $\xi_i$ refer to the appropriate rotations
about the center of mass, and we write the velocity $v = dx/dt$, computed in the $u_i$
coordinates, as

$$v = v_1\cdot u_1 + v_2\cdot u_2 + v_3\cdot u_3.$$

If $F$ is the total force on the rigid body, of mass $m$, then, with $'$ again the
rotating differentiation,

$$F = m\,\frac{dv}{dt} = m(v' + \xi\times v),$$

which gives us

$$(\mathrm{F})\qquad \begin{aligned}
F_1 &= m(v_1' - v_2\xi_3 + v_3\xi_2) \\
F_2 &= m(v_2' - v_3\xi_1 + v_1\xi_3) \\
F_3 &= m(v_3' - v_1\xi_2 + v_2\xi_1).
\end{aligned}$$


We will apply these results to find a set of equations for the general case of a 
disc of radius $a$ and mass $m$ rolling on a plane;¹ a very different approach will
be used in Addendum 12A. 

In the figure below, the line $l$ is the intersection of the plane of the disc with
the $(x, y)$-plane, and $l'$ is the line in the plane of the disc parallel to $l$ through
the center $O$. We choose $u_1$ to lie along $l'$, with $u_2$ lying along the line from $O$
to the contact point. We let $\phi$ be the angle through which the contact point has
rotated from the $x$-axis, as before, while $\theta$ is now the angle that $u_3$ makes with
the vertical. If one imagines the center $O$ moved over to the origin, then our $\phi$
is just the Euler angle $\phi$ from page 342 and our $\theta$ is the Euler angle $\theta$ (and $l'$ is
the line of nodes). It will actually be more convenient for us to write things in
terms of the angle $\vartheta = \pi/2 - \theta$, with the upright disc corresponding to $\vartheta = 0$.

The vectors $v$ and $\omega$ are related by the rolling condition (compare page 240)

$$v + \omega\times(-a\,u_2) = 0,$$

which gives us

$$(\mathrm{a})\qquad v_1 = -a\omega_3, \quad v_2 = 0, \quad v_3 = a\omega_1,$$

so the $\omega_i$ determine the $v_i$.

To find the components $\xi_i$ of $\xi$ with respect to the $u_i$, we note that the
coefficient of $u_1$ is just $\theta' = -\vartheta'$, while there is an angular velocity of $\phi'$ about
the vertical line through $O$, and decomposing this into its components with
respect to $u_2$ and $u_3$ we have

$$(\mathrm{b})\qquad \xi_1 = -\vartheta', \quad \xi_2 = \phi'\cos\vartheta, \quad \xi_3 = \phi'\sin\vartheta.$$


¹ From Synge and Griffith [1].




Since the angular velocities of the disc and the triple $u_1, u_2, u_3$ differ only in
the $u_3$ component, we also have

$$(\mathrm{c})\qquad \omega_1 = -\vartheta', \quad \omega_2 = \phi'\cos\vartheta.$$

Because of equations (a)–(c), we so far really have only three unknown functions,
$\phi$, $\vartheta$, and $\omega_3$ = the rate at which the disc is rolling.

The total force $F$ on the disc is the downward gravitational force of magnitude
$mg$ and the reaction force $R$ of the plane on the disc at the contact point, which
we write in terms of three more unknowns,

$$R = R_1\cdot u_1 + R_2\cdot u_2 + R_3\cdot u_3,$$

so that

$$F = R - mg(\cos\vartheta\cdot u_2 + \sin\vartheta\cdot u_3).$$

Using (a)–(c) in the equations (F) to express everything in terms of our six
unknowns, we have

$$(\mathrm{F}')\qquad \begin{aligned}
R_1 &= -ma(\omega_3' + \vartheta'\phi'\cos\vartheta) \\
R_2 - mg\cos\vartheta &= -ma(\vartheta'^2 + \phi'\sin\vartheta\cdot\omega_3) \\
R_3 - mg\sin\vartheta &= -ma(\vartheta'' - \phi'\cos\vartheta\cdot\omega_3).
\end{aligned}$$


Similarly, equations (T) give (noting that the downwards gravitational force
produces no torque around $O$)

$$(\mathrm{T}')\qquad \begin{aligned}
aR_3 &= I_1(\vartheta'' + \phi'^2\cos\vartheta\sin\vartheta) - I_3\phi'\cos\vartheta\cdot\omega_3 \\
0 &= I_1(\phi'\cos\vartheta)' + I_3\vartheta'\omega_3 - I_1\phi'\vartheta'\sin\vartheta \\
aR_1 &= I_3\omega_3'.
\end{aligned}$$

Equations (F′) and (T′) are now 6 equations for 6 unknowns $\vartheta$, $\phi$, $\omega_3$, $R_1$,
$R_2$, $R_3$, and in terms of these unknowns we can find the $\omega_i$ by (c), and then
the $v_i$ by (a). Although we would obviously have to resort to numerical solutions
in general, we can examine some special cases, and root out some additional
information.

One obvious solution is

$$\begin{aligned}
\vartheta &= 0 & R_1 &= 0 \\
\phi &= \text{constant} & R_2 &= mg \\
\omega_3 &= \text{constant} & R_3 &= 0,
\end{aligned}$$

which is just the disc rolling vertically along a straight line. We can also look
for a solution with the disc rolling along a circle, inclined at a fixed angle,

$$\vartheta = \text{constant}, \qquad \phi' = \text{constant}, \qquad \omega_3 = \text{constant}.$$

Equations (F′) then become

$$\begin{aligned}
R_1 &= 0 \\
R_2 &= m(-a\phi'\sin\vartheta\cdot\omega_3 + g\cos\vartheta) \\
R_3 &= m(a\phi'\cos\vartheta\cdot\omega_3 + g\sin\vartheta),
\end{aligned}$$

and when we substitute into the first equation of (T′) we find the necessary
condition

$$(I_3 + ma^2)\phi'\cos\vartheta\cdot\omega_3 + mga\sin\vartheta = I_1\phi'^2\sin\vartheta\cos\vartheta$$

(the angle at which the disc is inclined is related to the centripetal force that
must be exerted in order for the disc to move in a circle).

We can also use the equations to investigate the stability of the straight line
motion. When we roll a coin, or a hoop or thin tire like a bicycle tire along a
surface, it stays close to straight line motion when it is spinning rapidly, but as it
slows down it suddenly wobbles and falls over. In the straight line motion, the
quantities

$$\vartheta,\ \vartheta',\ \vartheta'',\ \phi',\ \phi'',\ \omega_3',\ R_1,\ R_2 - mg,\ R_3$$

are all 0, so they will all be small for a small deviation from straight line motion
and equations (F′) and (T′) give, up to first order,

$$\begin{aligned}
(1)\quad & R_1 = -ma\omega_3' & (4)\quad & aR_3 = I_1\vartheta'' - I_3\phi'\omega_3 \\
(2)\quad & R_2 - mg = 0 & (5)\quad & 0 = I_1\phi'' + I_3\vartheta'\omega_3 \\
(3)\quad & R_3 - mg\vartheta = -ma\vartheta'' + ma\phi'\cdot\omega_3 & (6)\quad & aR_1 = I_3\omega_3'.
\end{aligned}$$

Equations (1) and (6) give $\omega_3$ = constant, and (5) then gives

$$I_1\phi' + I_3\vartheta\omega_3 = \text{constant}.$$

Substituting this into the equation obtained by eliminating $R_3$ from (3) and (4)
then gives us

$$A\vartheta'' + B\vartheta = \text{constant}, \qquad A = I_1(I_1 + ma^2), \quad B = I_3(I_3 + ma^2)\omega_3^2 - I_1 mga.$$

This gives small oscillations for $B > 0$, but unbounded solutions for $B < 0$, so
for stability we need $B > 0$, or

$$\omega_3^2 > \frac{I_1 mga}{I_3(I_3 + ma^2)}.$$
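
To get a feeling for the size of this threshold, here is a small Python sketch
that evaluates it for a thin uniform disc ($I_3 = ma^2/2$, $I_1 = ma^2/4$) and for a
thin hoop ($I_3 = ma^2$, $I_1 = ma^2/2$). These moments of inertia are the standard
textbook values rather than anything derived in this addendum, and the mass,
radius and value of $g$ are hypothetical sample numbers.

```python
import math


def critical_spin(I1, I3, m, a, g=9.81):
    """Spin rate omega_3 (rad/s) at which B = 0, i.e. the stability boundary
    omega_3**2 = I1*m*g*a / (I3*(I3 + m*a**2))."""
    return math.sqrt(I1 * m * g * a / (I3 * (I3 + m * a**2)))


m, a, g = 0.01, 0.012, 9.81  # hypothetical coin: 10 grams, 12 mm radius

disc = critical_spin(m * a**2 / 4, m * a**2 / 2, m, a, g)
hoop = critical_spin(m * a**2 / 2, m * a**2, m, a, g)

# For these two shapes the threshold reduces to g/(3a) and g/(4a) respectively.
print(disc, math.sqrt(g / (3 * a)))
print(hoop, math.sqrt(g / (4 * a)))
```

In both cases the mass cancels, so the threshold depends only on the radius: a
smaller coin has to spin faster to stay upright.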




ADDENDUM 9B 


SECRETS OF 
THE HERPOLHODE 


Since the center $O$ of the inertia ellipsoid is at a constant distance from the
invariable plane as the ellipsoid rolls on it, the polhode can be described as the
set of points on the ellipsoid whose tangent planes are at a fixed distance from
the center. This definition can be made for any ellipsoid, even though not every
ellipsoid is an inertia ellipsoid (Problem 5-10). If the equation of the ellipsoid is

$$ax^2 + by^2 + cz^2 = 1,$$

a computation shows that the distance $d$ from the tangent plane at $(x, y, z)$ to
the origin is $d = 1/\sqrt{a^2x^2 + b^2y^2 + c^2z^2}$, so these general polhodes satisfy

$$\begin{aligned}
ax^2 + by^2 + cz^2 &= 1 \\
a^2x^2 + b^2y^2 + c^2z^2 &= \frac{1}{d^2} = D, \text{ say,}
\end{aligned}$$

for various constants $D$.


For the case of an inertia ellipsoid, we will switch to $x_1, x_2, x_3$ for the
components of a point in the body coordinates, so that it has the equation

$$I_1x_1^2 + I_2x_2^2 + I_3x_3^2 = 1,$$

and we will let $x_1(t), x_2(t), x_3(t)$ be the body coordinates of the point of the
polhode corresponding to some distance $d = 1/\sqrt{D}$. Then we have

$$(1)\qquad \begin{aligned}
x_1^2 + x_2^2 + x_3^2 &= r^2 + \frac{1}{D} \\
I_1x_1^2 + I_2x_2^2 + I_3x_3^2 &= 1 \\
I_1^2x_1^2 + I_2^2x_2^2 + I_3^2x_3^2 &= D,
\end{aligned}$$

where $r$ is the distance from the point with body coordinates $x_1, x_2, x_3$ to the
point $P$ in the invariable plane directly below the fixed point $O$, which will
be used as the origin for polar coordinates $(r, \phi)$ in that plane. Recall also
(page 337) that for the inertia ellipsoid, the distance $d$ is given by

$$(2)\qquad d = \frac{\sqrt{2T}}{|L|}.$$




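Setting

$$\Delta = (I_1 - I_2)(I_2 - I_3)(I_3 - I_1)$$

and solving the three equations in (1) for the $x_i$ in terms of the $I_i$, $r$ and $D$ gives

$$(3)\qquad \begin{aligned}
x_1^2 &= -\frac{I_2I_3(I_2 - I_3)}{\Delta}\,(r^2 - a_1), & a_1 &= -\frac{(I_2 - D)(I_3 - D)}{I_2I_3D} \\
x_2^2 &= -\frac{I_3I_1(I_3 - I_1)}{\Delta}\,(r^2 - a_2), & a_2 &= -\frac{(I_3 - D)(I_1 - D)}{I_3I_1D} \\
x_3^2 &= -\frac{I_1I_2(I_1 - I_2)}{\Delta}\,(r^2 - a_3), & a_3 &= -\frac{(I_1 - D)(I_2 - D)}{I_1I_2D}.
\end{aligned}$$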
We will assume that $I_1 > I_2 > I_3$. If $1/d^2 = D = I_1$, so that $d = 1/\sqrt{I_1}$,
the smallest semi-axis of the inertia ellipsoid, then the polhode and herpolhode
are just points, and similarly if $D = I_3$. The special case $D = I_2$ will be
disposed of at the end, and we will assume that $D$ is between $I_2$ and $I_3$, the
case where $D$ is between $I_1$ and $I_2$ being similar, with various signs changed.
In the current case, $\Delta < 0$ and $a_1, a_2 > 0$ while $a_3 < 0$.
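
The algebra behind (3) is easy to get wrong, so a numerical spot check may be
reassuring. The Python sketch below takes the relations in the form
$x_i^2 = -I_jI_k(I_j - I_k)(r^2 - a_i)/\Delta$ with $a_i = -(I_j - D)(I_k - D)/(I_jI_kD)$
for cyclic $(i, j, k)$, and verifies that the resulting squares satisfy
$\sum x_i^2 = r^2 + 1/D$, $\sum I_ix_i^2 = 1$ and $\sum I_i^2x_i^2 = D$. The function
name and the sample values are hypothetical.

```python
def polhode_squares(I1, I2, I3, D, r2):
    """Return (x1^2, x2^2, x3^2) and (a1, a2, a3) for given moments, D and r^2."""
    delta = (I1 - I2) * (I2 - I3) * (I3 - I1)
    a1 = -(I2 - D) * (I3 - D) / (I2 * I3 * D)
    a2 = -(I3 - D) * (I1 - D) / (I3 * I1 * D)
    a3 = -(I1 - D) * (I2 - D) / (I1 * I2 * D)
    x1 = -I2 * I3 * (I2 - I3) * (r2 - a1) / delta
    x2 = -I3 * I1 * (I3 - I1) * (r2 - a2) / delta
    x3 = -I1 * I2 * (I1 - I2) * (r2 - a3) / delta
    return (x1, x2, x3), (a1, a2, a3)


# Sample with I1 > I2 > I3, D between I2 and I3, and r^2 between a1 and a2.
I1, I2, I3, D, r2 = 3.0, 2.0, 1.0, 1.5, 0.12
squares, (a1, a2, a3) = polhode_squares(I1, I2, I3, D, r2)
x1, x2, x3 = squares

# Residuals of the three defining equations; each should vanish up to rounding.
sphere = x1 + x2 + x3 - (r2 + 1.0 / D)
ellipsoid = I1 * x1 + I2 * x2 + I3 * x3 - 1.0
tangent = I1**2 * x1 + I2**2 * x2 + I3**2 * x3 - D
print(sphere, ellipsoid, tangent)
print(a1 > 0, a2 > 0, a3 < 0)
```

For the squares to be non-negative in this case, $r^2$ has to lie between $a_1$ and
$a_2$, which is why the sample value of $r^2$ was chosen in that range.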

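Since the points on the polhode are of the form $\omega/\sqrt{2T}$, their coordinates
satisfy

$$(4)\qquad \omega_i = \sqrt{2T}\,x_i,$$

and the Euler equations yield

$$(5)\qquad \begin{aligned}
I_1x_1' + \sqrt{2T}\,(I_3 - I_2)x_2x_3 &= 0 \\
I_2x_2' + \sqrt{2T}\,(I_1 - I_3)x_3x_1 &= 0 \\
I_3x_3' + \sqrt{2T}\,(I_2 - I_1)x_1x_2 &= 0.
\end{aligned}$$

The first equation of (1) gives $rr' = x_1x_1' + x_2x_2' + x_3x_3'$, which together with (5)
gives

$$r\frac{dr}{dt} = \sqrt{2T}\left(\frac{I_2 - I_3}{I_1} + \frac{I_3 - I_1}{I_2} + \frac{I_1 - I_2}{I_3}\right)x_1x_2x_3 = -\frac{\Delta\sqrt{2T}\,x_1x_2x_3}{I_1I_2I_3},$$

and then using (3),

$$(\mathrm{A})\qquad r\frac{dr}{dt} = \sqrt{2T}\,\sqrt{-(r^2 - a_1)(r^2 - a_2)(r^2 - a_3)}.$$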




Finding a formula for d¢/dt will be quite a bit more interesting. In the 
figure below, we have drawn the polhode cone shown in (b) of the figure on 





page 337, together with the cone from the same origin O to the herpolhode. As 
the polhode cone rotates about the point O, it moves, generator by generator, 
on the herpolhode cone, providing a mapping from the polhode cone to the 
herpolhode cone, and there is no stretching during this process, so this mapping 
must be an isometry.

One can show this formally by considering the parameterizations

$$\begin{aligned}
(s, t) &\mapsto s\cdot p(t) && \text{of the polhode cone} \\
(s, t) &\mapsto s\cdot h(t) && \text{of the herpolhode cone,}
\end{aligned}$$

where $p$ is the vector from $O$ to a point on the polhode, and $h$ the vector
from $O$ to the corresponding point on the herpolhode. Since each generator
of the polhode goes to a generator of the same length on the herpolhode, $\partial/\partial s$
has the same length at corresponding points. Since rolling implies that equal
lengths are marked off on the polhode and the herpolhode at all times, $\partial/\partial t$ has
the same length at corresponding points. And since the mapping at each time
is, up to first order, a rotation about a generator, the inner products of $\partial/\partial s$ and
$\partial/\partial t$ have the same values at corresponding points.

This means that the region between two generators of the polhode cone has 
the same area as the region of the herpolhode cone between the corresponding 





generators, and so the same is certainly true for the rate of change of these areas,
keeping one generator fixed and varying the other. But for the rate of change,
we can approximate these curved regions with the triangular regions bounded
by the generators (a), so that the rate of change of the triangular regions must
be the same. And for the triangular regions we can say that the rate of change
of the projections of the areas on the invariable plane must also be the same.
In the case of the herpolhode cone, this projection (b) is simply the triangular
region between two radii from $P$ to the herpolhode, and the rate of change is

$$\frac{1}{2}r^2\frac{d\phi}{dt},$$

since $\frac{1}{2}r^2\,d\phi$ is the integrand for area in polar coordinates. So we just have to
determine the rate of change of the projection of the corresponding triangular
region of the polhode cone, the triangle having $p(t)$ and $p(t + h)$ as its sides.

We do this in two steps, first finding the answer for the projection on the various
coordinate planes in the body coordinates. If $p(t) = (x_1(t), x_2(t), x_3(t))$,
then for the projection in the $(x_2, x_3)$ plane we are looking at the area of the
triangle bounded by the lines from the origin to

$$(x_2(t), x_3(t)) \quad\text{and}\quad (x_2(t + h), x_3(t + h)),$$

which is just half the determinant

$$x_2(t)x_3(t + h) - x_2(t + h)x_3(t),$$

and the rate of change is thus $\tfrac{1}{2}(x_2x_3' - x_3x_2')(t)$, with similar formulas for the
other planes.
Note that equations (5) give

$$x_2x_3' - x_3x_2' = \frac{x_1\sqrt{2T}}{I_2I_3}\left[I_2(I_1 - I_2)x_2^2 + I_3(I_1 - I_3)x_3^2\right],$$

and the quantity in brackets is simply $I_1 - D$, as one sees by eliminating $x_1^2$
from the last two equations of (1). Similarly for the other two expressions, so
that we have

$$(6)\qquad \begin{aligned}
x_2x_3' - x_3x_2' &= \frac{x_1\sqrt{2T}\,(I_1 - D)}{I_2I_3} \\
x_3x_1' - x_1x_3' &= \frac{x_2\sqrt{2T}\,(I_2 - D)}{I_3I_1} \\
x_1x_2' - x_2x_1' &= \frac{x_3\sqrt{2T}\,(I_3 - D)}{I_1I_2}.
\end{aligned}$$
Now we note that since the vector $L$ is, in body coordinates,

$$L = (\omega_1I_1, \omega_2I_2, \omega_3I_3),$$

the cosines of the angles that $L$ makes with the axes in the body coordinates
are

$$\frac{\omega_iI_i}{|L|} = \frac{\sqrt{2T}\,x_iI_i}{|L|} \qquad\text{by (4).}$$

In the standard coordinate system, where $L$ is now the $z$-axis, these give the
cosines of the angle between the $z$-axis and the coordinate planes in the body, so
using (6) we find that the rate of change of the area of projection of the triangle
in the polhode cone onto the invariable plane is

$$\frac{1}{2}\cdot\frac{2T}{|L|}\left[\frac{I_1 - D}{I_2I_3}\,I_1x_1^2 + \frac{I_2 - D}{I_3I_1}\,I_2x_2^2 + \frac{I_3 - D}{I_1I_2}\,I_3x_3^2\right].$$

Replacing the $x_i^2$ by their values in (3) we find that

$$(\mathrm{B})\qquad r^2\frac{d\phi}{dt} = \frac{2T\,(r^2 + E)}{|L|}$$

for

$$E = \frac{(I_1 - D)(I_2 - D)(I_3 - D)}{I_1I_2I_3D}.$$

From (A) and (B), we then get (since $|L| = \sqrt{2TD}$ by (2))

$$\frac{dr}{d\phi} = \frac{dr}{dt}\Big/\frac{d\phi}{dt} = \frac{r\sqrt{D}\,\sqrt{-(r^2 - a_1)(r^2 - a_2)(r^2 - a_3)}}{r^2 + E}.$$


For those adventurous enough to compute the curvature $\kappa$, using Problem 6,
it will appear that the formula involves positive terms together with the factors
$(I_2 + I_3 - I_1)$, $(I_1 + I_3 - I_2)$, and $(I_1 + I_2 - I_3)$, so that $\kappa > 0$ for inertia
ellipsoids, where each of these factors is positive.




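In the case $D = I_2$, we have $E = 0$ and $a_1 = a_3 = 0$, and

$$\frac{dr}{d\phi} = r\sqrt{I_2}\,\sqrt{a_2 - r^2},$$

which we can integrate explicitly. Writing

$$d\phi = \frac{dr}{r\sqrt{I_2}\,\sqrt{a_2 - r^2}}$$

and using the substitution $r = \sqrt{a_2}/u$ we obtain

$$d\phi = -\frac{du}{\sqrt{a_2I_2}\,\sqrt{u^2 - 1}},$$

so

$$\phi = -\frac{1}{\sqrt{a_2I_2}}\cosh^{-1}u,$$

or

$$\frac{\sqrt{a_2}}{r} = \cosh\bigl(\sqrt{a_2I_2}\,\phi\bigr).$$

The dashed line in the figure comes from negative values of $\phi$.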
Geometrically, $D = I_2$ means that we are rolling the ellipsoid along one of
the two ellipses that make up the polhode. If we start at an intersection of 





the two ellipses, the solid spiral in the above picture of the herpolhode is the 
path obtained by rolling in one direction, the dashed spiral the path obtained 
by rolling in the other direction. Choosing the other ellipse simply gives the 
mirror image of this spiral. 


This material is adapted from Appell [1; Vol. 2, sect. 393]. 




PROBLEMS 


1. Equations (1) and (2) on page 334 can also be written as

$$\begin{aligned}
I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2 &= 2T \\
I_1^2\omega_1^2 + I_2^2\omega_2^2 + I_3^2\omega_3^2 &= L^2.
\end{aligned}$$

(The fact that the left sides of these equations are constants can be derived
directly from the Euler equations: for the first, we multiply the $i^{\text{th}}$ Euler equation
by $\omega_i$ and add; for the second we multiply by $I_i\omega_i$.)

(a) Solve to obtain

$$\begin{aligned}
\omega_1^2 &= P - Q\omega_2^2 \\
\omega_3^2 &= R - S\omega_2^2
\end{aligned}$$

for positive $P$, $Q$, $R$, and $S$, and conclude that

$$\omega_2'^2 = \frac{(I_3 - I_1)^2}{I_2^2}\,(P - Q\omega_2^2)(R - S\omega_2^2).$$

(b) Transform this to

$$\left(\frac{d\xi}{d\tau}\right)^2 = (1 - \xi^2)(1 - k^2\xi^2), \qquad \xi = \frac{\omega_2}{\beta}, \quad \tau = pt$$

for positive constants $\beta$, $p$ and $k$. The solution to this equation is the elliptic
function sn, so that

$$\omega_2 = \beta\,\operatorname{sn}[p(t - t_0)]$$

for a constant $t_0$, and $\omega_1$ and $\omega_3$ can then be found in terms of $\omega_2$ (in fact, it
turns out that they can be written in terms of the elliptic functions cn and dn).
Further details can be found in several sources, including Synge and Griffith [1]
and Landau and Lifschitz [1].


2. For the symmetric top, let $P$ be the plane containing $L$, $\omega$, and $u_3$.

(a) $P$ rotates with frequency $A$ around $u_3$, so the angular velocity of the body
around $P$ is $-Au_3$.

(b) The angular velocity of $P$ in the inertial frame is $L/I_1$.

(c) The angular velocity of the body in the inertial frame is $L/I_1 - Au_3 = \omega$.


3. (a) For a rotating body with $I_1 < I_2 < I_3$, suppose we have $\omega_1 = 0$, $\omega_2 = 0$,
$\omega_3$ = constant, and we consider small changes in $\omega_1$ and $\omega_2$, with $\omega_3$ staying
constant. Show that there are solutions $\omega_i(t) = a_ie^{pt}$ with

$$p^2 = \frac{(I_3 - I_2)(I_1 - I_3)}{I_1I_2}\,\omega_3^2,$$

so that we have a stable oscillatory solution, and a similar result holds for
$\omega_1$ = constant, while in the case of $\omega_2$ we will have an unstable exponentially
increasing solution.

Problem 4-12 may now be used to show that in general, rotations with $\omega_1$ or
$\omega_3$ constant are stable, while those with $\omega_2$ constant are unstable.


4. Given two vectors $\mathbf{a}, \mathbf{b} \in \mathbb{R}^3$, suppose that $A(t)$ is rotation about $\mathbf{a}$ by the
angle $a(t)$ and $B(t)$ is rotation about $\mathbf{b}$ by the angle $b(t)$. Show that the angular
velocity vector $\omega$ of $C(t) = B(t)A(t)$ is given by

$$\omega(t) = a'(t)\cdot\mathbf{a} + b'(t)\cdot A(t)(\mathbf{b}),$$


and generalize to multiple compositions. 


5. (a) Recall that for a curve given in polar coordinates by $r = r(\phi)$, i.e.,
parameterized by

$$x(\phi) = r(\phi)\cos\phi, \qquad y(\phi) = r(\phi)\sin\phi,$$

the slope of the tangent line is

$$\frac{r(\phi)\cos\phi + r'(\phi)\sin\phi}{-r(\phi)\sin\phi + r'(\phi)\cos\phi}.$$

If this tangent line makes an angle of $\beta$ with the horizontal axis, so that
$\alpha = \beta - \phi$ is the angle between the tangent line and the line from the origin to the
point, then

$$\tan\alpha = \tan(\beta - \phi) = \frac{r(\phi)}{r'(\phi)}.$$

In Leibnizian notation we have

$$\tan\alpha = r/(dr/d\phi) = r\frac{d\phi}{dr},$$

which thus gives $\tan\alpha$ when we instead consider “$\phi$ as a function of $r$”.

(b)¹ Consider the projection of the curve traced out by the axis of a top onto
the $(x, y)$-plane, where for the functions $r(t)$, $\phi(t)$ we have

$$(r, \phi) = (\sin\theta, \phi) = (\sqrt{1 - \cos^2\theta}, \phi) = (\sqrt{1 - u^2}, \phi).$$

Opportunistically mixing $'$ notation for derivatives with respect to time with
Leibnizian notation, show that

$$\tan\alpha = r\frac{d\phi}{dr} = r\,\frac{d\phi}{dt}\,\frac{dt}{dr} = \frac{(u^2 - 1)\phi'}{uu'}.$$

Recalling that $b = a\cos\theta_1 = au_1$ (page 350), we see that the formula for $\tan\alpha$
has a factor of $(u - u_1)$ in the numerator, while the denominator has only the
factor

$$uu' = u\sqrt{(u - u_1)(u - u_2)(u - u_3)},$$

so that when $u = u_1$ we have $\tan\alpha = 0$, i.e., the tangent line is pointing radially
toward the origin.


6. From the equations $x(\phi) = r(\phi)\cos\phi$, $y(\phi) = r(\phi)\sin\phi$ in Problem 5,
find both the first and second derivatives of $x$ and $y$, and then use the formula

$$\kappa = \frac{x'y'' - y'x''}{(x'^2 + y'^2)^{3/2}}$$

for the curvature of a curve $t \mapsto (x(t), y(t))$ to deduce that

$$\kappa = \frac{r^2 + 2\left(\dfrac{dr}{d\phi}\right)^2 - r\,\dfrac{d^2r}{d\phi^2}}{\left(r^2 + \left(\dfrac{dr}{d\phi}\right)^2\right)^{3/2}}.$$

7. (a) If all principal moments are equal to $I$, so that $L = I\omega$ (page 190), use
the vector form (E,) of the Euler equations to show that $\tau = L'$ also holds in
body coordinates.
(b) Also deduce this from the equations (E;).


¹ From Cabannes [1].




8. The situation considered in Addendum A, where $I_1 = I_2$, can also be
approached somewhat differently. Let $\omega$ be the angular velocity of the rotating
orthonormal basis $(u_1, u_2, u_3)$ and let $L = I_1\omega_1\cdot u_1 + I_1\omega_2\cdot u_2 + I_3\omega_3\cdot u_3$
be the expression for the angular momentum with respect to these axes. If $a$
is the angular velocity of fixed axes in the body with respect to these axes, then
the angular momentum $\mathbf{L}$ with respect to fixed axes in the body is

$$\mathbf{L} = L + aI_3\cdot u_3.$$

Use $\tau = \mathbf{L}' + \omega\times\mathbf{L}$ to show that the Euler equations become

$$\begin{aligned}
\tau_1 &= I_1\omega_1' + (I_3 - I_1)\omega_2\omega_3 + I_3a\omega_2 \\
\tau_2 &= I_1\omega_2' + (I_1 - I_3)\omega_3\omega_1 - I_3a\omega_1 \\
\tau_3 &= I_3(\omega_3' + a'),
\end{aligned}$$

or deduce the results directly from the equation (T) on page 362.


CHAPTER 10 


NON-INERTIAL SYSTEMS 
AND FICTITIOUS FORCES 


In Chapter 9 we derived equations for the motion of a body in terms of a
coordinate system located within the body itself. More generally, we now
consider what the equations of motion for any particle become in an arbitrarily
moving, usually non-inertial, coordinate system. One of the main reasons for
this investigation is that the behavior of numerous common phenomena depends
on the fact that they are really being observed in such a moving coordinate
system, namely one located on the rotating earth.


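The basic equations. For the case of rotating coordinate systems, we derived in
Chapter 9 the basic formula

$$\frac{dr}{dt} = r' + \boldsymbol{\omega}\times r$$

for any curve $r$, where $d/dt$ is the ordinary derivative and $'$ is the rotating
differentiation. Applying this to a particle $x$, and then extending our
computations, we have

$$\begin{aligned}
\frac{dx}{dt} &= x' + \boldsymbol{\omega}\times x \\
\frac{d^2x}{dt^2} &= \Bigl(\frac{dx}{dt}\Bigr)' + \boldsymbol{\omega}\times\frac{dx}{dt} \\
&= (x'' + \boldsymbol{\omega}\times x') + \boldsymbol{\omega}'\times x + \boldsymbol{\omega}\times(x' + \boldsymbol{\omega}\times x) \\
&= x'' + 2\cdot\boldsymbol{\omega}\times x' + \boldsymbol{\omega}'\times x + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times x).
\end{aligned}$$

We are going to be working almost entirely in the rotating coordinate system,
so we will set $x' = v$, with the understanding that $v$ denotes the velocity as
computed in the rotating coordinate system, and similarly we will set $x'' = a$,
where the acceleration $a$ is also computed in this rotating coordinate system.
Multiplying our equation by the mass $m$ of $x$, and rearranging, we have

$$ma = m\frac{d^2x}{dt^2} - m\cdot\boldsymbol{\omega}\times(\boldsymbol{\omega}\times x) - 2m\cdot\boldsymbol{\omega}\times v - m\cdot\boldsymbol{\omega}'\times x,$$

and if $F$ is the force acting on $x$ we can write this as

$$ma = F - m\cdot\boldsymbol{\omega}\times(\boldsymbol{\omega}\times x) - 2m\cdot\boldsymbol{\omega}\times v - m\cdot\boldsymbol{\omega}'\times x.$$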


This equation says that, using measurements in the rotating coordinate system,
the particle behaves as if it were under the influence of the force $F$ together
with three other “fictitious forces”:

$-m\cdot\boldsymbol{\omega}\times(\boldsymbol{\omega}\times x)$   the centrifugal force
$-2m\cdot\boldsymbol{\omega}\times v$   the Coriolis force
$-m\cdot\boldsymbol{\omega}'\times x$   the azimuthal or Euler force
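
The three terms are simple enough to evaluate directly. The Python sketch below
does so for vectors given by their components in the rotating system, using a
hand-written cross product; the function names and the sample values are
hypothetical and serve only as an illustration.

```python
def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def scale(c, p):
    """Multiply the vector p by the scalar c."""
    return tuple(c * pi for pi in p)


def fictitious_forces(m, omega, domega, x, v):
    """Return (centrifugal, Coriolis, Euler) forces for mass m at position x
    with velocity v, all expressed in the rotating coordinate system."""
    centrifugal = scale(-m, cross(omega, cross(omega, x)))
    coriolis = scale(-2.0 * m, cross(omega, v))
    euler = scale(-m, cross(domega, x))
    return centrifugal, coriolis, euler


# Rotation about the third axis, speeding up; particle on the first axis,
# moving along the second axis.
m = 1.0
omega = (0.0, 0.0, 2.0)
domega = (0.0, 0.0, 0.5)
x = (3.0, 0.0, 0.0)
v = (0.0, 1.0, 0.0)

centrifugal, coriolis, euler = fictitious_forces(m, omega, domega, x, v)
print(centrifugal, coriolis, euler)
```

In this sample the centrifugal force points along $x$, away from the axis, and the
Euler force is opposite to the direction in which the rotation is speeding up.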


If our moving coordinate system includes translations, with the origin being
translated to a new origin $b(t)$ at time $t$, then we will have one more term in
the equation, giving another “fictitious force”,

$-m\cdot b''$   the translational or acceleration force.


These “fictitious forces”, always having the factor m, are just the forces that 
would be needed to account for the not-at-all-fictitious correction terms that 
need to be made to the acceleration because our observations are made in the 
moving coordinate system. ‘They are useful, though often confusing, theoretical 
constructs precisely because, as mentioned at the beginning of this chapter, we 
sometimes do make observations in a convenient coordinate system that is not
an inertial coordinate system, the rotating earth being a prime example. 


The translational or acceleration force. This force is easily envisioned by
imagining that you are sitting on a totally frictionless flat cart at rest; if the cart
accelerates, you remain at rest, so you will seem to accelerate backwards to the 
cart—in the cart’s coordinate system, it appears (if the observers on the cart are 
insensitive to the fact that they are being accelerated) that there is a force act- 
ing on you, of which you are blissfully unaware. If the cart has a back to lean 
against, so that you don’t fall off, then you will feel a force, which is not this 
fictitious translational force, but the opposite force that keeps you on the cart. 
A more involved example involves what happens when we push a container 
of water along a surface, first accelerating it to get it moving, and then letting it 
decelerate to come to a stop; the water will tilt toward the rear of the container 




as the container accelerates, and tilt in the other direction as it decelerates. The
reason is that the water is subject to the force of gravity, straight downwards, 




plus a horizontal fictitious force, so that the resultant is at an angle, and the 
effect is the same as if gravity were actually acting at that angle. 

At first this example seems to be a typical case where the physicist’s way of 
looking at things is confoundingly different from the mathematician’s; we seem 
to have resorted to a physics trick instead of analyzing the problem in terms of 
the laws of motion. But we’re actually in no position to make such an analysis, 
since we currently have no principles for dealing with a liquid. In fact, even in 
the case where there is no acceleration at all we haven’t discussed any principles
to demonstrate that the surface of the water will remain horizontal! 

But even if we can’t give an analysis, the physicist recognizes a principle, and 
the mathematician should recognize what might be called an “invariant” of the 
problem: the surface of the water at any point is always perpendicular to the 
total force at that point. 

Of course, as good physicists and/or mathematicians, we should seek to gen- 
eralize this example, and the next force will play a role in this. 


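The centrifugal force. The complicated looking formula for the centrifugal force
can be expressed more simply geometrically. If $u$ is a unit vector along $\boldsymbol{\omega}$, and
$\omega = |\boldsymbol{\omega}|$, then the centrifugal force is (recall the formula on page 189)

$$\begin{aligned}
-m\cdot\boldsymbol{\omega}\times(\boldsymbol{\omega}\times x) &= -m\cdot\bigl(\langle\boldsymbol{\omega}, x\rangle\boldsymbol{\omega} - \omega^2x\bigr) \\
&= m\omega^2\bigl(x - \langle u, x\rangle u\bigr) \\
&= m\omega^2x^\perp,
\end{aligned}$$

where $x^\perp$ is the vector to $x$ from the line along $\boldsymbol{\omega}$, perpendicular to that line,
so the magnitude of the centrifugal force depends on the distance from the axis
of rotation, and the force is directed away from it.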
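
The identity between the double cross product and the geometric form
$m\omega^2x^\perp$ can be spot-checked numerically. In the Python sketch below, the two
function names and the sample vectors are invented for the illustration.

```python
import math


def cross(p, q):
    """Cross product of two 3-vectors given as tuples."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def centrifugal_direct(m, omega, x):
    """-m * omega x (omega x x), computed with two cross products."""
    return tuple(-m * c for c in cross(omega, cross(omega, x)))


def centrifugal_geometric(m, omega, x):
    """m * |omega|^2 * x_perp, where x_perp = x - <u, x> u and u = omega/|omega|."""
    w = math.sqrt(sum(c * c for c in omega))
    u = tuple(c / w for c in omega)
    ux = sum(ui * xi for ui, xi in zip(u, x))
    x_perp = tuple(xi - ux * ui for xi, ui in zip(x, u))
    return tuple(m * w**2 * c for c in x_perp)


# Arbitrary sample values.
m = 2.0
omega = (1.0, 2.0, -0.5)
x = (0.4, -1.0, 3.0)

direct = centrifugal_direct(m, omega, x)
geometric = centrifugal_geometric(m, omega, x)
print(all(abs(p - q) < 1e-9 for p, q in zip(direct, geometric)))
```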

If you are revolving rapidly with a weight suspended by a string in front of 
you, and then release the string, the weight is now free to move in a straight 
line, and seems to fly away from you. In your (rotating) coordinate system it 
acts as if there is a force on it, the centrifugal force. Until the time of release, 
the weight has stayed in front of you because of a counteracting actual force 
that you exert on it, the “centripetal” force. By the third law, the weight is 
exerting an equal and opposite force on your hand. Unfortunately, the name 
“centrifugal force” is often applied to this force, leading people to think that 
they are “feeling the centrifugal force”, and wondering why it should be called 
fictitious—yet another example of how easily the notion of fictitious forces can 
lead to confusion. It might help to speak of the translational and the centrifugal 
“acceleration corrections”, remembering that we multiply them by mass to get 
forces, which we then of course divide by the mass to get accelerations. 




To generalize our previous example involving liquids, suppose that we rotate a
cylindrical container of water with angular speed $\omega$. Then a particle of mass $m$
on the surface of the water at distance $r$ from the center is subject to a downward
force $mg$ and a centrifugal force of $m\omega^2r$, as in (a). Our invariant says that the
resultant force should be perpendicular to the surface of the water, whose profile
curve is the graph of some function $f$. From (b) we see that

$$f'(r) = \tan\theta = \frac{\omega^2r}{g}$$

and thus

$$f(r) = \frac{\omega^2}{2g}\,r^2,$$

and the profile curve is a parabola.
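
As a quick illustration, the following Python sketch evaluates the profile
$f(r) = \omega^2r^2/(2g)$ and compares its numerical slope with the ratio of
centrifugal to gravitational force, $\omega^2r/g$. The function names, the value of
$g$ and the sample spin rate and radius are assumptions made for the example.

```python
def surface_height(r, omega, g=9.81):
    """Height of the water surface at distance r from the axis."""
    return omega**2 * r**2 / (2.0 * g)


def force_slope(r, omega, g=9.81):
    """Ratio of centrifugal force (m omega^2 r) to weight (m g); m cancels."""
    return omega**2 * r / g


omega = 10.0  # radians per second (sample value)
r = 0.05      # meters from the axis (sample value)
h = 1e-6      # step for the central difference

numeric_slope = (surface_height(r + h, omega) - surface_height(r - h, omega)) / (2 * h)
print(numeric_slope, force_slope(r, omega))
```

The two printed numbers agree up to rounding, which is just the statement that the
surface is perpendicular to the resultant force at each point.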

Another simple example of centrifugal force arises for the problem of a bead
on a rotating rigid wire, which we considered on page 227, where the angular
rotation is $\omega = |\boldsymbol{\omega}|$ for a vector $\boldsymbol{\omega}$ pointing out from the plane of the paper. In
the rotating coordinate system, the only force along the wire is the centrifugal
force on the bead, which is $m\omega^2x$ when the bead is at distance $x$ from the
center (the other two forces are perpendicular to the wire, since both $x$ and $v$
are along the wire). So the equation of motion is simply $x'' = \omega^2x = \theta'^2x$, as
obtained previously.


The deflection of a hanging body. Centrifugal force also made an unheralded
appearance in our discussion of the Eötvös experiment in Addendum 1B, the
whole point of this experiment being that the proportionality of mass to weight
amounts to saying that weight acts just like a fictitious force, so we can test the
proportionality by comparing it to the centrifugal force, which is definitely
proportional to mass, since it is a fictitious force.




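To calculate the deflection $\theta$ of a hanging body, like that used in the Eötvös
experiment, we consider the hanging particle $x$ hovering just above the earth’s
surface, at latitude $\lambda$, and a rotating coordinate system $(u_1, u_2, u_3)$ at the center
of the earth defined by letting $u_3$ point toward $x$ and choosing $u_2$ perpendicular
to it, parallel to the direction of north at $x$; then $u_1$ is parallel to the direction
of east at $x$, so that in the picture it points into the plane of the paper. The
rotation vector of the earth, $\boldsymbol{\omega} = (0, 0, \omega)$, can be written as

$$\boldsymbol{\omega} = 0\cdot u_1 + \omega\cos\lambda\cdot u_2 + \omega\sin\lambda\cdot u_3.$$

If $R$ is the radius of the earth, then $x = R\cdot u_3$, so

$$\begin{aligned}
\boldsymbol{\omega}\times x &= \boldsymbol{\omega}\times R\cdot u_3 = \omega R\cos\lambda\cdot u_1 \\
-m\cdot\boldsymbol{\omega}\times(\boldsymbol{\omega}\times x) &= -m\omega^2R(\cos\lambda\sin\lambda\cdot u_2 - \cos^2\lambda\cdot u_3).
\end{aligned}$$

The gravitational force on $x$ is $-mg\cdot u_3$, so the total force on $x$ is

$$(1)\qquad \begin{aligned}
F &= -mg\cdot u_3 - m\omega^2R(\cos\lambda\sin\lambda\cdot u_2 - \cos^2\lambda\cdot u_3) \\
&= -m\omega^2R\cos\lambda\sin\lambda\cdot u_2 - m(g - \omega^2R\cos^2\lambda)\cdot u_3.
\end{aligned}$$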


So the downwards acceleration is decreased from $g$ to $g - \omega^2R\cos^2\lambda$, resulting
in a teensy decrease in weight as we approach the equator, and a small
southward force is added, resulting in a southward deflection $\theta$ with

$$\tan\theta = \frac{\omega^2R\cos\lambda\sin\lambda}{g - \omega^2R\cos^2\lambda}.$$

Even though $R$ is large, the angular velocity $\omega = 2\pi/(24\cdot 3600)$ radians per
second is so small that $\omega^2R/g$ has the small value $\omega^2R/g \approx .00344$, so that we
can write

$$\tan\theta \approx \frac{\omega^2R}{g}\cos\lambda\sin\lambda.$$

Computations show that for a plumb bob hung from a height of 50 meters at
latitude 45° (the leaning tower of Pisa is a good approximation), the bob should
end up about 8.5 cm south, which one might assume is easily measurable on
a windless day. But there’s actually no “origin” from which to measure this
distance, since “straight down” is always determined by a plumb bob! What this
comes down to is that for actual measurements, the direction we choose for $u_3$
won’t really be exactly in the radial direction, and for later use we should really
write (1) as

$$(1')\qquad F = m\mathbf{g} = -mg\cdot u_3,$$

where $g = |\mathbf{g}|$ is the acceleration of gravity that we measure at a particular
spot on earth, and where $\mathbf{g}$ points in the direction of a plumb bob, which is the
direction we will really have ended up choosing for $u_3$. Actually, the bulging
equator of the rotating earth causes an additional deflection of the plumb bob,
and this too is taken into account in equation (1′).
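
The numbers in this section are easy to reproduce. The Python sketch below
evaluates $\omega^2R/g$ and the southward offset $50\tan\theta$ at latitude 45°, using
the exact expression for $\tan\theta$ rather than the approximation. The values
$R = 6.371\times 10^6$ meters and $g = 9.8$ m/s² are assumed here (they are not
stated in the text), and the function name is made up.

```python
import math

OMEGA = 2.0 * math.pi / (24 * 3600)  # earth's angular velocity, rad/s
R = 6.371e6                          # assumed mean radius of the earth, meters
G = 9.8                              # assumed acceleration of gravity, m/s^2


def deflection_tangent(latitude_deg, omega=OMEGA, radius=R, g=G):
    """tan(theta) = w^2 R cos(lat) sin(lat) / (g - w^2 R cos^2(lat))."""
    lam = math.radians(latitude_deg)
    c = omega**2 * radius
    return c * math.cos(lam) * math.sin(lam) / (g - c * math.cos(lam) ** 2)


ratio = OMEGA**2 * R / G                  # compare with .00344
offset = 50.0 * deflection_tangent(45.0)  # meters south for a 50 meter drop
print(ratio, offset)
```

With these inputs the offset comes out between 8 and 9 centimeters, in line with
the figure quoted above; the exact digits depend on the values assumed for $R$
and $g$.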


The azimuthal or Euler force. A simple example of the azimuthal force arises
when you are in a car x accelerating as it goes around a curve, so that in addition
to $\boldsymbol{\omega}$, giving the centrifugal force that seems to push you towards the outside
of the car, there is a non-zero $\boldsymbol{\omega}'$. This gives the additional force $-m\boldsymbol{\omega}'\times\mathbf{x}$,
which seems to push you towards the back of the car, a rotational analogue of
the translational or acceleration force.

[Figure: the force $-m\boldsymbol{\omega}'\times\mathbf{x}$.]

This force doesn’t play an important role for observations on the rotating
earth, because the earth’s rotation $\boldsymbol{\omega}$ is so nearly constant, though as we’ve seen
in the discussion of the Euler precession (page 339), $\boldsymbol{\omega}$ moves along a small cone
around the line to the North star, so that $\boldsymbol{\omega}'$ is a tiny vector pointing inward.
This might explain the name “Euler force”, or it might have been given simply
because the whole equation on page 376 had actually been deduced by Euler
very early on (cf. Persson [1; pg. 15]). One hardly ever sees the Euler force
mentioned in mechanics problems, but it makes a surprise appearance in the 
final chapter. 


The Coriolis force. Unlike the previous fictitious forces, the Coriolis force on 
a particle depends not on its position but on its velocity. It was introduced by 
Coriolis in a paper of 1835, expanding on one from 1832, both to be found 
in Coriolis [1]. This work originated with the study of water-wheels accord- 
ing to Dugas [1], where a description of Coriolis’ work shows how amazingly 
complicated the origins of a simple idea can be (cf. also page 420). 




As a simple example of the Coriolis force, consider the case of the rotating
bead. Here the Coriolis force is perpendicular to the wire, with magnitude
$2m\omega|\mathbf{v}|$. The wire will have to exert the negative of the Coriolis force on the
bead in order to keep it moving around, so as the bead moves both further away
and faster, the wire must be stiffer to resist bending.

[Figure: the bead on the rotating wire, with its velocity $\mathbf{v}$ and the Coriolis force.]

A more involved situation arises when you are on a rotating platform, like
a carousel, as in (a) of the figure below. If you are facing the center, you will
have to dig in your heels to prevent yourself from being thrown off by the
centrifugal force. An additional, Coriolis, force will arise if you try to walk
towards the center, going from A to B in (b) of the figure, basically because
you are already moving in a tangential direction, and the point B that you are
trying to reach is moving less quickly in this direction. You will have to exert
pressure on the right side of your feet to tack left, giving the impression that
you are moving against a force to the right, which is the direction of $-\boldsymbol{\omega}\times\mathbf{v}$,
as shown in (c); you can feel the effect at one blow by jumping from a wooden
horse on the carousel to one nearer the center. Also see the amusing movie at
ww2010.atmos.uiuc.edu/(Gh)/guides/mtr/fw/gifs/coriolis.mov.

[Figure: panels (a), (b), (c) of the rotating platform.]


The deflection of a falling body. One significant consequence of the differing
tangential speeds at different distances from the center of rotation was
pointed out by Newton in 1679 in a brief reply to a letter from Hooke (see
Newton [1; Vol. 2, pg. 301]). Because of the controversy over the Copernican
theory, there had been a history of experiments to prove the earth’s rotation
by dropping objects from a height to observe a supposed westward deflection
during the fall, the futility of which had already been explained by Galileo
(cf. Chapter 7). Newton pointed out that not only does an object dropped from
a tall building already have a horizontal motion east but also, since this horizontal
motion is slightly greater than the horizontal motion of the earth below it (as
Galileo had also noted), the object should actually end up very slightly to the
east of that building when it hit the earth, “quite contrary to the opinion of
the vulgar who think that if the earth moved, heavy bodies in falling would
be outrun by its parts & fall on the west side of the perpendicular.” Newton
proposed a means of detecting such a tiny deflection to the east, by comparing
the distribution of many dropped objects, though Hooke’s trials proved quite
inconclusive.

[Figure: the fall of the object, with the directions East and West marked.]

A naive calculation based on Newton’s observation might be the following.
For a building of height $h$, the time of descent is very close to

\[ T = \sqrt{2h/g}, \]

the usual answer when we ignore the earth’s turning. If the building is at latitude
$\lambda$, and $\omega$ is the angular velocity of the earth, then the body’s horizontal
speed exceeds that of the earth’s surface by $h\omega(\cos\lambda)$. So it
seems it should fall to the east by the amount $h\omega(\cos\lambda)T = h\omega(\cos\lambda)\sqrt{2h/g}$, or

\[ \text{(E)}\qquad \omega(\cos\lambda)\sqrt{2h^3/g}. \]


This computation actually assumes that the body and the base of the tower 
are each moving in straight lines, rather than circular arcs, but for such short 
distances the difference is presumably negligible. 

On the other hand, there is another effect that we might not notice, and might
assume was also negligible even if we did notice it. When the falling body has
reached a position making a (very small) angle $\theta$ with the perpendicular at
the center of the earth, gravity will produce an acceleration of magnitude $g$
along the line to the center of the earth, and this acceleration will have a tiny
horizontal westward component of

\[ \mathrm{acc}_{\text{west}} = (\cos\lambda)\,g\sin\theta \approx (\cos\lambda)\,g\theta = (\cos\lambda)\,g\omega t. \]

So the total westward motion during the descent, from time 0 to time $T$, will be

\[ \text{(W)}\qquad (\cos\lambda)\,g\omega\cdot\frac{T^3}{6} = \frac{1}{3}(\cos\lambda)\,\omega\sqrt{2h^3/g}, \]

and the total eastward deflection should actually be the difference (E) − (W),

\[ \frac{2}{3}(\cos\lambda)\,\omega\sqrt{2h^3/g}. \]


This calculation naturally leaves us a little queasy—who knows what else we 
might have ignored!—and a very easy alternate solution is given in Problem 1. 
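
To get a feeling for the sizes involved, the naive estimate (E), the westward correction (W) and their difference can be evaluated numerically. This is only an illustrative sketch: the value of g, the 100 meter height and the 45° latitude are assumptions made here, and the function name is hypothetical.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def naive_deflections(h, latitude_deg):
    """Return (E, W, E - W) in meters for a drop from height h."""
    c = math.cos(math.radians(latitude_deg))
    T = math.sqrt(2 * h / g)           # time of descent, ignoring rotation
    east = h * omega * c * T           # (E): excess eastward speed times T
    west = c * g * omega * T**3 / 6    # (W): (cos lam) g omega t integrated twice
    return east, west, east - west

E, W, net = naive_deflections(100.0, 45.0)
print(f"E = {1000 * E:.2f} mm, W = {1000 * W:.2f} mm, E - W = {1000 * net:.2f} mm")
```

For these assumed values the net eastward deflection is roughly 1.5 cm, and (W) is exactly one third of (E), as the formulas require.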




It isn’t clear whether any such calculations were even made before the 19th
century.¹ The first calculations usually mentioned were made in 1803, when
the question of the path of a falling object was analyzed by Laplace [1; Vol. 14, 
267-277] and independently by the younger Gauss [1; Vol. 5, 498-503], for the 
benefit of the experimenter Benzenberg, who eventually measured deflections 
in a deep mine shaft. 

Gauss elegantly analyzed the problem by translating the equations of motion 
in an inertial system into equations for a system on the rotating earth, essentially 
a derivation of our basic equation on page 376 for the case of uniform rotation. 
We will continue to use the coordinate system on page 380, and write vectors
in terms of these coordinates, so that $(a,b,c)$ will stand for $a\cdot\mathbf{u}_1 + b\cdot\mathbf{u}_2 + c\cdot\mathbf{u}_3$.
Note that, as explained on page 381, the third axis simply points along the
direction of a plumb bob, whose lowest position is what we use as the point
from which to measure the deflection. Since we are now dealing with a moving
particle, we must amend equation (1′) on page 381 to read

\[ \mathbf{F} = (0,0,-mg) + \mathbf{F}_{\text{Coriolis}}. \]

The Coriolis force $\mathbf{F}_{\text{Coriolis}}$ for the particle $(x(t), y(t), z(t))$, with velocity vector
$\mathbf{v}(t) = (x'(t), y'(t), z'(t))$, is given by

\begin{align*}
\mathbf{F}_{\text{Coriolis}} &= -2m\cdot\boldsymbol{\omega}\times\mathbf{v} \\
&= -2m\omega\cdot(0,\cos\lambda,\sin\lambda)\times(x',y',z') \\
&= -2m\omega\,(z'\cos\lambda - y'\sin\lambda,\; x'\sin\lambda,\; -x'\cos\lambda),
\end{align*}

so that the particle $(x, y, z)$ satisfies

\begin{align*}
x'' &= -2\omega(z'\cos\lambda - y'\sin\lambda) \\
y'' &= -2\omega x'\sin\lambda \\
z'' &= 2\omega x'\cos\lambda - g.
\end{align*}

¹ Newton’s letter doesn’t include one, though Arnold [3; pp. 19-20] seems to assume
that Newton’s observation amounts to the “naive calculation” on the previous page.

Although we could find useful approximate solutions directly (Problem 2),
these particular equations can actually be solved exactly. For a dropped object,
which has the initial conditions $0 = x(0) = y(0)$ and $0 = x'(0) = y'(0) = z'(0)$,
the last two equations give

\begin{align*}
y'(t) &= -2\omega(\sin\lambda)\,x(t) \\
z'(t) &= 2\omega(\cos\lambda)\,x(t) - gt,
\end{align*}

and substituting these into the first equation then gives

\[ x''(t) + 4\omega^2 x(t) = 2\omega g(\cos\lambda)\,t, \]

with the particular solution $x(t) = g(\cos\lambda)t/(2\omega)$, and the general solution

\[ x(t) = \frac{g\cos\lambda}{2\omega}\cdot t + A\sin 2\omega t + B\cos 2\omega t. \]

The initial condition $x(0) = 0$ gives $B = 0$ and then the initial condition
$x'(0) = 0$ gives $A = -g\cos\lambda/(4\omega^2)$, so that

\[ x(t) = \frac{g\cos\lambda}{2\omega}\left(t - \frac{\sin 2\omega t}{2\omega}\right). \]


Having obtained the correct exact formula, it is now easy to make a useful
approximation. We have $2\omega t \ll 1$ for a drop from any reasonable height, so
we can use the approximation

\[ \sin 2\omega t \approx 2\omega t - \frac{(2\omega t)^3}{6}, \]

giving the approximate equation

\[ x(t) \approx \tfrac{1}{3}\,g\,\omega(\cos\lambda)\cdot t^3. \]

Letting $T$ be the total time of descent, $T \approx \sqrt{2h/g}$, we find that the total
displacement $x(T)$ in the $\mathbf{u}_1$ direction, east, is approximately

\[ \tfrac{2}{3}\,\omega(\cos\lambda)\sqrt{2h^3/g}. \]
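
How good is the cubic approximation? A quick numerical comparison with the exact formula suggests the two are indistinguishable for any realistic drop. The sketch below assumes g = 9.8 m/s², a 100 meter drop and latitude 45°; none of these values come from the text, and the function names are invented for the illustration.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def x_exact(t, lam):
    """Exact eastward displacement (g cos(lam) / 2w) (t - sin(2wt) / 2w).
    The subtraction cancels heavily, but double precision still leaves
    about eight significant digits for drops lasting a few seconds."""
    return g * math.cos(lam) / (2 * omega) * (t - math.sin(2 * omega * t) / (2 * omega))

def x_cubic(t, lam):
    """Leading-order approximation (1/3) g w cos(lam) t^3."""
    return g * omega * math.cos(lam) * t**3 / 3

lam = math.radians(45.0)
h = 100.0
T = math.sqrt(2 * h / g)
east = 2 / 3 * omega * math.cos(lam) * math.sqrt(2 * h**3 / g)
print(x_exact(T, lam), x_cubic(T, lam), east)
```

The last two printed values are the same expression written in terms of $T$ and in terms of $h$; the first differs from them only by a relative amount of order $(2\omega T)^2$.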


It might be of interest to note that Gauss, in a letter to Benzenberg, expressed
great surprise at having found “a deflection to the east only 2/3 of that which
Dr. Olbers has found.” The physician Olbers, a good friend of Gauss, made
several important astronomical discoveries, though he is probably most famous
for “Olbers’ paradox” that the night sky should be infinitely bright. Perhaps
his calculation of the deflection was the naive calculation on page 383, without
the added correction; I have the sneaking suspicion that this correction wasn’t
noticed until after the calculations of Laplace and Gauss had been made.¹


¹ Just to make everything a bit more complicated, Gauss’ equations had extra terms
to account for the change of direction and magnitude of $\mathbf{g}$, but he excluded them as
being insignificant for an almost vertical fall; note, however, that this refers to the very
slight variation of direction of $\mathbf{g}$ in the rotating coordinate system, due to the fact that
the object passes over different positions above the not exactly spherical earth.




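The southward deflection. Since there is a factor of $\omega$ in the formula for $x$, and
thus for $x'$, our original equation for $y''$ will have a factor of $\omega^2$, and thus the
southward deflection will be negligible. But see Addendum B!

Using the formula for $x(t)$ to get an approximation for $x'(t)$ gives
$y'' \approx -g\cos\lambda\sin\lambda\,(1-\cos 2\omega t) \approx -2g\omega^2\cos\lambda\sin\lambda\cdot t^2$; with initial conditions $y(0) =
y'(0) = 0$, integrating twice gives $y(t) \approx -\tfrac{1}{6}g\omega^2\cos\lambda\sin\lambda\cdot t^4$, and with $T \approx \sqrt{2h/g}$
we find¹ that the southward displacement $-y(T)$ is $(2/3)(h^2/g)\,\omega^2\cos\lambda\sin\lambda$.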
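
Both deflections can be checked by integrating the three equations of motion for $x$, $y$, $z$ directly. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step (a choice made here, not the author's method); g = 9.8 m/s², h = 100 m and latitude 45° are assumed values, and the integration simply stops at $T = \sqrt{2h/g}$ rather than detecting the actual moment of impact.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def deriv(state, lam):
    """Right-hand side of the rotating-frame equations.
    state = (x, y, z, vx, vy, vz) with x east, y north, z up."""
    x, y, z, vx, vy, vz = state
    c, s = math.cos(lam), math.sin(lam)
    ax = -2 * omega * (vz * c - vy * s)
    ay = -2 * omega * vx * s
    az = 2 * omega * vx * c - g
    return (vx, vy, vz, ax, ay, az)

def rk4_drop(h, lam, steps=2000):
    """Integrate a drop from rest at height h up to T = sqrt(2h/g)."""
    T = math.sqrt(2 * h / g)
    dt = T / steps
    state = (0.0, 0.0, h, 0.0, 0.0, 0.0)
    for _ in range(steps):
        k1 = deriv(state, lam)
        k2 = deriv(tuple(u + 0.5 * dt * k for u, k in zip(state, k1)), lam)
        k3 = deriv(tuple(u + 0.5 * dt * k for u, k in zip(state, k2)), lam)
        k4 = deriv(tuple(u + dt * k for u, k in zip(state, k3)), lam)
        state = tuple(u + dt / 6 * (a + 2 * b + 2 * c + d)
                      for u, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

lam = math.radians(45.0)
h = 100.0
x_T, y_T, z_T = rk4_drop(h, lam)[:3]
east_formula = 2 / 3 * omega * math.cos(lam) * math.sqrt(2 * h**3 / g)
south_formula = 2 / 3 * (h**2 / g) * omega**2 * math.cos(lam) * math.sin(lam)
print("east :", x_T, east_formula)
print("south:", -y_T, south_formula)
```

The southward value is smaller than the eastward one by a factor of order $\omega T$, which is the sense in which it is negligible.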


Stupid experimenter tricks. A related problem, which also vexed the “vulgar”,
giving rise to tales of experiments that one hopes are apocryphal, concerns the
behavior of a cannonball shot directly vertically upwards. If the projectile has
an initial speed of $v_0$, the only difference in the above calculations is that the
equation for $z'$ on page 384 becomes

\[ z'(t) = 2\omega(\cos\lambda)\,x(t) - g\cdot t + v_0, \]

leading to

\[ x''(t) + 4\omega^2 x(t) = 2\omega g\cos\lambda\,(t - v_0/g), \]

with the general solution

\[ x(t) = \frac{g\cos\lambda}{2\omega}\left(t - \frac{v_0}{g}\right) + A\sin 2\omega t + B\cos 2\omega t. \]

The initial condition $x(0) = 0$ gives $B = v_0\cos\lambda/(2\omega)$, and $x'(0) = 0$ gives
the same result as before for $A$, so that

\[ x(t) = \frac{g\cos\lambda}{2\omega}\left(t - \frac{v_0}{g}\right) - \frac{g\cos\lambda}{4\omega^2}\sin 2\omega t + \frac{v_0\cos\lambda}{2\omega}\cos 2\omega t. \]

The Taylor series for $\sin 2\omega t$ and $\cos 2\omega t$ then give the approximate equation

\[ \text{(a)}\qquad x(t) \approx \tfrac{1}{3}\,g\omega\cos\lambda\cdot t^3 - \omega v_0\cos\lambda\cdot t^2. \]

The total time from the firing of the projectile to its landing back on earth is
$T = 2v_0/g$, or $T = 2\sqrt{2h/g}$ if $h$ is its maximum height, and we find that

\[ x(T) \approx -\tfrac{4}{3}\,\omega\cos\lambda\sqrt{8h^3/g}. \]


Thus, the deflection of the cannonball is to the west—by an extremely tiny 
amount that won’t change the denouement of the experiment. 
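
As a rough check on the size of this westward shift, the exact solution for $x(t)$ can be evaluated at $T = 2v_0/g$ and compared with the approximate formula. The sketch below assumes g = 9.8 m/s², a muzzle speed of 100 m/s and latitude 45°, all chosen here purely for illustration; the function name is hypothetical.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def x_shot(t, v0, lam):
    """Exact x(t) for a projectile fired straight up with speed v0."""
    c = math.cos(lam)
    return (g * c / (2 * omega) * (t - v0 / g)
            - g * c / (4 * omega**2) * math.sin(2 * omega * t)
            + v0 * c / (2 * omega) * math.cos(2 * omega * t))

v0 = 100.0                             # m/s, illustrative
lam = math.radians(45.0)
T = 2 * v0 / g                         # nominal time of flight
h = v0**2 / (2 * g)                    # maximum height
x_exact_T = x_shot(T, v0, lam)
x_formula = -4 / 3 * omega * math.cos(lam) * math.sqrt(8 * h**3 / g)
print(x_exact_T, x_formula)
```

For these assumed values the shell comes down somewhat less than a meter west of the cannon, after rising about half a kilometer.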
¹ A more careful calculation (with refinements involving terms of order $\omega^2$ that are
insignificant for the value of the eastward deflection) yields the answer $4(h^2/g)\,\omega^2\cos\lambda\sin\lambda$;
see Belorizky and Sivardière [1] for details.

Note that the approximation (a) might lead us to the equation

\[ x'(t) = g\omega\cos\lambda\cdot t\,(t - 2v_0/g), \]

so that $x'(t) < 0$ for $t < 2v_0/g$, which is the entire time of the trip except for
the last moment, implying that the projectile is heading westward throughout
the trip, except that $x'(2v_0/g) = 0$, with the projectile landing exactly vertically.
Thus, the eastward Coriolis force as the projectile falls from the highest point 
seems to exactly cancel out the westward velocity that has been acquired at 
the highest point. Actually, the computer generated graph of the path shown
below indicates that the maximum westward position occurs a very short time
before the end of the trip, which ends with a tiny eastward velocity. An analytic
verification from the exact equations might be rather unpleasant.

[Figure: computer-generated graph of the path, with the point of maximum westward deflection marked.]
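
A numerical check, at least, is not hard. Differentiating the exact solution for $x(t)$ and using the half-angle identities gives $x'(t) = \cos\lambda\,\sin\omega t\,\bigl[(g/\omega)\sin\omega t - 2v_0\cos\omega t\bigr]$, so within this model the westward motion stops when $\tan\omega t = 2\omega v_0/g$. The sketch below compares that instant with the landing time, found by bisection from $z(t)$, where $z$ is obtained by integrating $z'(t) = 2\omega(\cos\lambda)x(t) - gt + v_0$ in closed form with $z(0) = 0$. The values g = 9.8 m/s², $v_0$ = 100 m/s and latitude 45° are assumptions made for the illustration, and all function names are hypothetical.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def x_dot(t, v0, lam):
    """Exact x'(t), factored as cos(lam) sin(wt) [(g/w) sin(wt) - 2 v0 cos(wt)]."""
    s, c = math.sin(omega * t), math.cos(omega * t)
    return math.cos(lam) * s * (g / omega * s - 2 * v0 * c)

def x_integral(t, v0, lam):
    """Integral of the exact x from 0 to t (uses cos(2wt) - 1 = -2 sin^2(wt))."""
    c = math.cos(lam)
    return (g * c / (2 * omega) * (t * t / 2 - v0 * t / g)
            - g * c / (4 * omega**3) * math.sin(omega * t)**2
            + v0 * c / (4 * omega**2) * math.sin(2 * omega * t))

def z(t, v0, lam):
    """Height above the cannon, from z' = 2 w cos(lam) x - g t + v0."""
    return v0 * t - 0.5 * g * t * t + 2 * omega * math.cos(lam) * x_integral(t, v0, lam)

def landing_time(v0, lam):
    """Bisection for the root of z between the apex and a time after landing."""
    lo, hi = v0 / g, 2 * v0 / g + 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if z(mid, v0, lam) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v0 = 100.0
lam = math.radians(45.0)
t_star = math.atan(2 * omega * v0 / g) / omega   # instant at which x' = 0
t_land = landing_time(v0, lam)
print(t_star, t_land, 2 * v0 / g)
print("x' at landing:", x_dot(t_land, v0, lam))
```

For these values the turning point precedes the landing by a few microseconds, and both fall just short of the nominal $2v_0/g$, which is consistent with what the graph indicates.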


Foucault’s pendulum. Coriolis had been a student of Poisson, who soon applied 
the results of Coriolis’ paper to calculate the deflection of artillery shells, which 
turned out to be less than those due to wind and other effects for the artillery of 
that time (though not for modern artillery), and Poisson likewise decided that 
the effect on a pendulum was too small to be observable. But the enterpris- 
ing experimenter Foucault realized that with a spherical pendulum, allowing 
the direction of swing to change with time, one could demonstrate the small 
Coriolis force caused by the rotation of the earth in the way pendulums have 
always been exploited, by observing the cumulative effect of many cycles. He 
eventually created a 67 meter pendulum, with a bob weighing 28 kilograms, in 
the Panthéon in Paris, which created a sensation at the Paris exhibition of 1851, 
where the plane in which the pendulum was swinging could clearly be seen to 
rotate over time, allowing people to “see the earth go round”. 

Of course, it wasn’t so clear just what people were actually seeing, since the 
popular explanation for Foucault’s pendulum was (and still is) that the plane of 
the pendulum’s swing stays fixed, while the earth turns beneath it. This would
be true for a pendulum at the north pole (a), where the clockwise rotation
would have the same magnitude $\omega$ as the earth’s counterclockwise rotation,
but at the equator, the plane of the pendulum’s swing appears stationary to an
observer, and if the pendulum is swinging in the north-south direction (b), this
plane is rotating in space right along with the earth (the case of an east-west
swing (c) is usually used instead, to deviously bolster the erroneous idea that the 
plane of the pendulum’s swing is always fixed). 

Foucault recognized that the situation was more complicated, and intuited
that at latitude $\lambda$ the rate of change of the pendulum’s swing should be $\omega\sin\lambda$,
certainly a good guess for a quantity that varies from $\omega$ to 0 as the latitude
changes from 90° to 0! But intuitive, basically geometric, arguments for the $\sin\lambda$
formula have never been entirely convincing, for good reason as we shall see.

First of all, we need a general idea of what the path of the pendulum bob
will look like to an observer on the earth. In the figure below, our pendulum
starts its swing at A. Instead of moving along the diameter (dashed line), it
will be deflected slightly to the east because of the Coriolis force, arriving at the
point B, where its velocity is 0 and it begins to swing back in the other direction.
This makes the Coriolis force change direction also, so the bob is now deflected
in the other direction, ending at C. Thus the bob will seem to have rotated
around the circumference of the circle by an angle $\varphi$ as it goes from A to C. Of
course, the angle $\varphi$ has been grossly exaggerated in this figure. Nevertheless,
in Foucault’s Panthéon demonstration, the distance from A to C was almost a
centimeter, and the rotation was observable in a short time.

[Figure: the path of the bob from A to B to C, with the angle $\varphi$.]

For the special case where the pendulum’s swing is exactly north-south over
the disk of radius $r$, as shown in (a) of the figure below, we can determine
the amount of rotation geometrically.¹ The shaded angle in (a) is also $\lambda$, so
the southernmost point of the disk is further from the axis of the earth’s rotation
than the center of the disk by the amount $r\sin\lambda$. If the center of
the disk rotates by the very small angle $\alpha$ during one swing of the pendulum,
then the southernmost point rotates an extra $r(\sin\lambda)\alpha$ counterclockwise along
the circumference of the disk, so the position of the pendulum bob, as observed
by some one standing on the earth, looking down at the disk (b), has moved
clockwise along the circumference by this amount.

[Figure: (a) the disk of radius $r$ at latitude $\lambda$; (b) the disk as seen from above.]

¹ From Kittel et al. [1].

Now one might hope that the rotation is the same for any direction of the 
pendulum’s swing. In fact, textbooks usually show the full path of the pendulum 
bob, seen on the earth, as made up of repetitions of the basic piece of the picture 
on the previous page, just as for the orbits discussed on page 128. But this
picture can’t be exactly correct, because the Coriolis force, involving the cross-product
with $\boldsymbol{\omega} = \omega(0, \cos\lambda, \sin\lambda)$, does in fact depend on the direction of the
pendulum’s swing.

The details will emerge as a by-product of our analytic computation, which
actually involves approximations right from the start (not all that surprising,
since approximations are needed even for an ordinary pendulum). We will use
the same coordinate system $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ as before, and consider only small oscillations
of the pendulum, of length $l$, say. If we ignore the Coriolis force, then,
as on page 291, the coordinates $x(t)$, $y(t)$ of the pendulum bob satisfy

\[ x'' = -\alpha^2 x, \qquad y'' = -\alpha^2 y \qquad \text{for } \alpha = \sqrt{g/l}. \]

When adding in the Coriolis force $\mathbf{F}_{\text{Coriolis}}$ on page 384, we note (aha!) that in
comparison to $x'$ and $y'$, the quantity $z'$ is small,¹ so we gleefully hasten to
discard it, obtaining approximate equations conveniently free of the $\cos\lambda$ term,

\begin{align*}
(*)\qquad x'' &= -\alpha^2 x + (2\omega\sin\lambda)\,y' \\
y'' &= -\alpha^2 y - (2\omega\sin\lambda)\,x'.
\end{align*}

Thus, $\omega\cos\lambda$, the $y$-component of $\boldsymbol{\omega}$, has a negligible role because
its effect in the $(x, y)$-plane depends on the $z$ component of the velocity, which
is negligible.


¹ At any time $t_0$ we can rotate our $x$ and $y$ axes far from the direction of the pendulum
swing, eliminating the problem of $x'$ or $y'$ being very small, or even zero.




Now we use the trick in the footnote on page 339, custom-made for a coupled
pair of equations. If we set $\zeta = x + iy$, we obtain

\[ \zeta'' + i(2\omega\sin\lambda)\,\zeta' + \alpha^2\zeta = 0, \]

which is just the equation for damped oscillations, in this case for a function
that is complex-valued to begin with. Setting $\zeta(t) = e^{pt}$, this becomes

\[ p^2 + i(2\omega\sin\lambda)\,p + \alpha^2 = 0. \]

Since $\omega^2 \ll \alpha^2$, we have

\[ p \approx -i(\omega\sin\lambda) \pm i\alpha, \]

and thus

\[ \zeta(t) \approx e^{-i(\omega\sin\lambda)t}\cdot\left(a e^{i\alpha t} + b e^{-i\alpha t}\right). \]


The second factor describes the motion of the spherical pendulum of Chapter 8,
which will simply be an ordinary pendulum motion if we release it without
any sidewise motion (Foucault obtained this condition by attaching the bob to
a stationary point with a thread, and then burning the thread). So we are considering
only $a$ and $b$ that reduce the second factor to $c\cos\alpha t$ for a constant $c$.
The solution for $\zeta$ then yields approximate equations

\begin{align*}
x(t) &= \cos(-(\omega\sin\lambda)t)\cdot c\cos(\alpha t) \\
y(t) &= \sin(-(\omega\sin\lambda)t)\cdot c\cos(\alpha t),
\end{align*}

which we might write as

\[ (**)\qquad \begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = \begin{pmatrix} \cos(-(\omega\sin\lambda)t) \\ \sin(-(\omega\sin\lambda)t) \end{pmatrix}\cdot c\cos(\alpha t). \]


Since $\omega\sin\lambda \ll \alpha$, we have something like the phenomenon of “beats”: the
pendulum swings with angular velocity $\alpha$ in a plane that is rotating with the
slow angular velocity $-\omega\sin\lambda$. This rotation is clockwise because of the minus
sign (counterclockwise in the southern hemisphere, where $\omega\sin\lambda$ is replaced by $-\omega\sin\lambda$).

Because (*) is not exact, the angle by which the pendulum advances actually
varies ever so slightly from swing to swing; thus, $\sin\lambda$ is merely an extremely
good approximation for the factor to be applied to the average of many nearly
equal advances.
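
The rotation rate can also be seen by integrating the approximate equations (*) numerically in their complex form. This is only a sketch under stated assumptions: it uses a hand-written fourth-order Runge-Kutta step, the 67 meter length mentioned for the Panthéon pendulum, a latitude of 48.85° for Paris and g = 9.8 m/s² (the last two are values assumed here, not given in the text), and a bob released from rest at $\zeta = 1$ in arbitrary units.

```python
import math

g = 9.8                                # m/s^2, assumed round value
omega = 2 * math.pi / (24 * 3600)      # rad/s, the value used in the text

def swing_plane_angle(length, latitude_deg, n_periods=10, steps_per_period=2000):
    """Integrate zeta'' = -2i*Omega*zeta' - alpha^2*zeta, zeta = x + iy, with RK4.

    Returns (direction of the bob in radians after n_periods, elapsed time, Omega),
    where Omega = omega * sin(latitude) and one period is 2*pi/alpha.
    """
    alpha = math.sqrt(g / length)
    Omega = omega * math.sin(math.radians(latitude_deg))
    period = 2 * math.pi / alpha
    dt = period / steps_per_period

    def rhs(zeta, vel):
        return vel, -2j * Omega * vel - alpha**2 * zeta

    zeta, vel = 1.0 + 0.0j, 0.0 + 0.0j          # released from rest
    for _ in range(n_periods * steps_per_period):
        k1z, k1v = rhs(zeta, vel)
        k2z, k2v = rhs(zeta + 0.5 * dt * k1z, vel + 0.5 * dt * k1v)
        k3z, k3v = rhs(zeta + 0.5 * dt * k2z, vel + 0.5 * dt * k2v)
        k4z, k4v = rhs(zeta + dt * k3z, vel + dt * k3v)
        zeta += dt / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
        vel += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return math.atan2(zeta.imag, zeta.real), n_periods * period, Omega

angle, elapsed, Omega = swing_plane_angle(67.0, 48.85)
print("direction after 10 swings:", angle)
print("predicted -Omega * t     :", -Omega * elapsed)
```

The direction comes out negative, that is, clockwise, and agrees closely with $-(\omega\sin\lambda)t$.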

Although we pictured the path of the Foucault pendulum as having cusps, in
practice it is extremely hard to guarantee that the second factor of our solutions
(**) will be exactly equal to $c\cos\alpha t$; even without the Coriolis force acting,
the motion of the bob, projected on the horizontal plane, will usually be an
extremely narrow ellipse rather than degenerating precisely into a straight line
segment, so when the Coriolis force is added in, the cusps would actually have
very tiny loops.


In this regard, it might be mentioned that although Foucault’s pendulum ex- 
periment of 1851 was generally regarded as a more convincing, and certainly 
a more dramatic, demonstration of the earth’s rotation than the experiments 
nearly a half-century earlier measuring the eastward deflection of falling bodies, 
there are many possible sources of error in constructing the pendulum, which 
actually requires great care. It was not until 1879 that these were thoroughly in- 
vestigated, allowing the construction of a Foucault pendulum that did not move 
in an increasingly elliptical fashion. This was the subject of a physics doctoral 
thesis at Groningen, “Nieuwe bewijzen voor de aswenteling der aarde” (New proofs
of the rotation of the earth), by Heike Kamerlingh Onnes, later renowned as the 
discoverer of superconductivity. An exposition of this impressive work, which 
will also be mentioned in Chapter 22, can be found in Schulz-Dubois [1]. 

Because of the difficulties encountered trying to picture what is really happen- 
ing with his pendulum, Foucault realized, perhaps in discussions with Poinsot, 
who shared his preference for geometric over analytic arguments, that the move- 
ment of the earth could be demonstrated much more directly by a gyroscope, 
whose axis of rotation would stay fixed. The instruments that had been constructed
up to that time were inadequate for the purpose, because the friction
and imprecision of the gimbal bearings created distortions totally masking the
desired effect. Though Foucault managed to overcome these obstacles, a microscope
was needed to see the deflection, because he could only get the gyroscope
to keep spinning unaided for about ten minutes. Wonderful pictures of
Foucault’s gyroscope can be found in Tobin [1; pp. 163, 288].

Foucault also invented the gyrocompass, as described at the end of Chapter 9, 
although its eventual usefulness on ships required the contributions of many 
others, especially Hermann Anschütz-Kaempfe, culminating in the Anschütz-gyroscope.
In addition, Foucault carried out many other important investigations,
and even before his famous pendulum demonstration he had performed
one of the first experiments showing that the speed of light was less in water
than in air, which was extremely important in the debate about the nature of
light, as we shall briefly discuss in Chapter 15. Much more about Foucault, and
the scientific milieu of his time, can be found in the recent estimable biography
Tobin [1] mentioned on the previous page; a discussion of Onnes’ work may
also be found there. 


Hurricanes and bath-tubs. The Coriolis force is of interest to meteorologists because,
among other things, it is the reason for the rotational winds of hurricanes,
which form around centers of low air pressure as outer air flows toward the center,
as in (a) of the figure below, where the dotted lines are “isobars”, indicating
areas of constant pressure, and the Coriolis force causes the inward paths to be
deflected to the right, as in (b), causing the distinctive counterclockwise swirling
associated with hurricanes, with, as usual, many additional complications to the
general picture.

[Figure: (a) air flowing across the isobars toward a center of low pressure; (b) the inward paths deflected to the right.]

The actual calculations, which we leave to the meteorologists, show that the 
very small Coriolis force can really have such an appreciable effect because the 
winds are moving with such high velocities, and for such long distances. 

In theory, and also in fact, hurricanes in the southern hemisphere rotate 
clockwise rather than counterclockwise. Likewise, theoretically water drains out 
of a bath-tub in a counterclockwise direction in the northern hemisphere, but 
clockwise in the southern hemisphere. This can be verified with very careful
experiments (Shapiro [1] for the northern hemisphere and Trefethen et al. [1]
for the southern¹), but in practice the direction depends on small effects, such
as the slight motions that the water already has, which are of much greater
magnitude than the Coriolis force could produce on water moving so slowly for
such a short distance. Tourist scams demonstrating the reversal of the draining
direction as ships pass the equator merely attest to the ease with which the 
direction can be subtly influenced. 

On a larger scale, it might be mentioned (cf. Persson [1; pg. 24]) that at one 
time plans to build a space station rotating rapidly to create an artificial gravity 
had to be abandoned when it was realized that the rotation necessary would 
create Coriolis forces thousands of times stronger than on earth, with all sorts 
of dire consequences. 


¹ And a multitude of others, cf. Am. J. Physics, 62 (1994), pg. 1063.




On a still larger scale, Addendum A discusses a much more complicated 
example where we get to consider the role of the Coriolis force in a feature of 
the solar system. 


Mach’s Principle. It is, alas, virtually impossible to conclude a discussion of 
rotating coordinate systems without mentioning Mach’s Principle, which, like 
most philosophical principles, seems to be simultaneously both extremely sig- 
nificant for, and totally irrelevant to, everything physicists do. 


Near the beginning of Chapter 7 (cf. page 275), we mentioned the philo- 
sophical problem of determining what one means by velocity and acceleration 
if one hasn’t already decided on a coordinate system, leading Newton to resort 
to the idea of “absolute space”; as we suggested in that chapter, we hope to 
have avoided an appeal to “absolute space” by rephrasing the first law in terms 
implying the existence of an inertial system, without the expectation of a way to
distinguish between all the different inertial systems. 


Although Newton did not suggest any way to distinguish one inertial system 
from another, he pointed out that one could definitely distinguish rotating sys- 
tems from non-rotating ones. He cited in particular the fact that water in a 
rotating bucket will assume a concave shape “(as experience has shown me)”. 
Though not specifically mentioning a parabolic shape, he added that “The rise 
of the water reveals its endeavor to recede from the axis of motion, and from 
such an endeavor one can find out and measure the true and absolute circular 
motion of the water.” A little later he also gave a more direct theoretical exam- 
ple: “if two balls, at a given distance from each other with a cord connecting 
them, were revolving about a common center of gravity, the endeavor of the 
balls to recede from the axis of motion could be known from the tension of 
the cord, and thus the quantity of circular motion could be computed. ... ”, 
similar in some respects to the discussion on page 18. And of course, the bulging 
of the earth at the equator, the deflection of falling bodies, Foucault’s experiment,
etc., are all further examples, and the whole raison d’être for this chapter.


The notion that it is impossible to distinguish between uniform rectilinear 
motion and absolute rest has only been strengthened by later developments in 
physics, especially special relativity. On the other hand, rotation seemed to be 
quite a different story, and philosophers’ rejection of the notion that one could 
determine absolute rotation seemed to be, not so much arguments that it wasn’t 
true, as complaints that it simply couldn’t, or shouldn’t, be true. 


This situation was changed considerably by the physicist Ernst Mach, who
became interested in questions of perception and was led thence to a philosophy of
science rather inimical to reliance on theoretical concepts. Mach’s analysis
of the rotating bucket experiment was that it can only reveal that the water is
rotating with respect to something else, and this something else must be the
“fixed stars”, or at any rate the mean position of these stars. As his ideas on this 
question developed, he seemed indeed to be putting forth the proposal, rather 
startling even to himself, that it was somehow the effect of all these stars in 
concert that caused the water to “endeavor to recede from the axis of motion”, 
and that this wouldn’t happen if they weren’t there. 


This notion, though embraced by some physicists, was dismissed with both 
horror and disdain by many others, and attempts have even been made to cite 
experimental results that might make it untenable. For example, the earth is 
located far out on a limb of our galaxy, so that the distribution of the nearby 
mass of the universe around the earth is quite unsymmetric, while numerous 
very delicate experiments have revealed no dependence of inertial mass on the 
direction of acceleration. 

But Machians dismiss such observations as simply showing that it is not the 
influence of nearby matter, but of all the matter in the universe, that is respon- 
sible for these phenomena. Of course, we can’t eliminate all the other matter 
in the universe to test this idea, which has thus been formulated so as to be 
unfalsifiable (which one would have thought ought to perturb the philosophers 
just a bit). And, in a modern rejoinder, Feynman [1; pg. 16-2] has pointed out 
mischievously, if not a bit petulantly, that if rotation is really being measured with
respect to the fixed stars, then why can’t rectilinear motion also be measured 
absolutely with respect to them? 


The whole contretemps might have remained permanently in the shadowy 
demimonde of the philosophy of science, had Mach’s views not attracted the 
attention of a bright young physicist, one Albert Einstein, as he was developing 
his theory of general relativity, very briefly mentioned at the end of Chapter 1. 


Einstein initially felt that general relativity demonstrated the correctness of 
Mach’s ideas, and there is a famous letter from Einstein to Mach (reproduced
in Misner, Thorne, and Wheeler [1; §21.12]), written as Einstein was developing 
the theory of general relativity, in which he says that if the expected verifica- 
tion during an eclipse is made, then Mach’s ideas “will receive brilliant con- 
firmation” (Mach did not reply to this letter, which is not surprising, since he 
didn’t even accept special relativity, or even the reality of atoms and molecules). 
Einstein discussed these ideas in several papers, and in Einstein [2] he stated 
what he called Mach’s principle “because this principle has the significance of 
a generalization of Mach’s requirement that inertia should be derived from an 
interaction of bodies.” Nevertheless, it is generally acknowledged that general 
relativity and Machian ideas have never actually been unified. 




The discussion in §21.12 of Misner, Thorne and Wheeler [1], written shortly
after Feynman’s book, practically declares Mach’s ideas to be correct if properly
understood. But contentious discussions of Mach’s ideas have continued, and 
may well continue forever. For those who just can’t get enough of this sort of 
thing, a good place to begin further reading might be 


Mach’s Principle: From Newton’s Bucket to Quantum Gravity, Barbour, J. and
Pfister, H. eds., Birkhäuser, 1995.


This volume is based on a conference held in 1993, perhaps the first ever devoted 
exclusively to Mach’s Principle. Being fairly recent, it may be close to the 
latest word on Mach’s Principle (though, given the philosophical nature of the 
question, undoubtedly not the last!).




ADDENDUM 10A 
THE TROJAN ASTEROIDS 


In Chapter 4 we easily progressed from the one-body problem to the two-body 
problem, and it might be thought that the three-body problem would require 
just a bit more effort. As the old adage says, however, two’s company, three’s a 
crowd. The three-body problem exhibits almost all of the intractability of the 
more general N-body problem. This circumstance led to the investigation of 
many special cases of the three-body problem where one could at least get some 
results, even if these results seemed unlikely to have any application. 


The restricted three-body problem. In particular, consider the case where one
of the bodies has a negligible mass $m$ compared to the other two, $b_1$ and $b_2$,
with masses $m_1 > m_2$; it is usually assumed also that $b_1$ and $b_2$ move in a plane,
and are actually in circular orbits about their center of mass $C$.


[Figure: $b_1$, $C$, and $b_2$ on a line, with $C$ lying between $b_1$ and $b_2$.]


Problem 4-10 shows that $b_1$ and $b_2$ rotate about $C$ with an angular velocity $\omega$
given by
$$\omega^2 a^3 = G(m_1 + m_2),$$
where $a$ is the distance between $b_1$ and $b_2$. One could also deduce this from
the equations (page 121) for circular motion: if $d_i = |b_i - C|$, then
$$\frac{G m_1 m_2}{a^2} = m_1 d_1 \omega^2 = m_2 d_2 \omega^2.$$

Since the force on the third body depends on its distances from $b_1$ and $b_2$,
the easiest coordinate system to work with is the rotating one in which they are
fixed. We choose the line from $b_1(t)$ to $b_2(t)$ as the rotating $x$-axis, with $C$
at $(0,0,0)$, so that $b_1(t)$ has coordinates $(-d_1,0,0)$ and $b_2(t)$ has coordinates
$(d_2,0,0)$, where
$$d_1 = \frac{m_2}{m_1+m_2}\,a, \qquad d_2 = \frac{m_1}{m_1+m_2}\,a.$$
Then $\boldsymbol{\omega} = (0,0,\omega)$ and for the third particle $c(t) = (c_1(t), c_2(t), c_3(t))$, we
have
$$\text{centrifugal force} = +m\cdot\omega^2\,(c_1(t), c_2(t), 0)$$
$$\text{Coriolis force} = +2m\cdot\omega\,(c_2'(t), -c_1'(t), 0),$$
so if we let $r_i(t) = |c(t) - b_i|$, we have
$$(*)\qquad (c_1''(t), c_2''(t), c_3''(t)) = -G\left[m_1\frac{c(t)-b_1}{r_1^3} + m_2\frac{c(t)-b_2}{r_2^3}\right] + \omega^2\cdot(c_1(t), c_2(t), 0) + 2\omega\cdot(c_2'(t), -c_1'(t), 0).$$


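We are not going to try to solve even this simplified equation in general. We
will merely look for the special case of a stationary orbit, in other words we
seek a solution with $c_i(t) = c_i$ for some constants $c_i$, which means that the orbit is
stationary in our rotating coordinate system. Since we then have $0 = c'(t) = c''(t)$,
the components of equation (*) give
$$\text{(a)}\qquad 0 = -G\left[\frac{m_1}{r_1^3}(c_1 + d_1) + \frac{m_2}{r_2^3}(c_1 - d_2)\right] + \omega^2 c_1$$
$$\text{(b)}\qquad 0 = \left(\omega^2 - G\left[\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right]\right) c_2$$
$$\text{(c)}\qquad 0 = -G\left[\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right] c_3.$$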


Equation (c) immediately gives $c_3 = 0$, so the stationary orbits must be in the
plane in which $b_1$ and $b_2$ revolve.

Equation (b) will automatically be satisfied if $c_2 = 0$, i.e., for a stationary
orbit that lies along the line between $b_1$ and $b_2$; for such “collinear” solutions,
the only condition is that given by (a). Suppose first that $c_1$ is between $-d_1$
and $d_2$, where $d_1 + d_2 = a$, and choose $\lambda$ with $r_2 = d_2 - c_1 = \lambda a$, and thus
$r_1 = (1-\lambda)a$.

[Figure: the points $-d_1$, $0$, $c_1$, $d_2$ on the $x$-axis, with $c_1$ at distance $(1-\lambda)a$ from $-d_1$ and $\lambda a$ from $d_2$.]

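Since
$$c_1 = d_2 - \lambda a = \left(\frac{m_1}{m_1+m_2} - \lambda\right)a,$$
equation (a) becomes
$$0 = -G\left[\frac{m_1}{(1-\lambda)^2a^2} - \frac{m_2}{\lambda^2a^2}\right] + \omega^2\left(\frac{m_1}{m_1+m_2} - \lambda\right)a,$$
and since
$$\frac{G}{\omega^2} = \frac{a^3}{m_1+m_2},$$
this reduces to
$$f(\lambda) = \frac{m_2}{\lambda^2} - \frac{m_1}{(1-\lambda)^2} + m_1 - (m_1+m_2)\lambda = 0.$$
We have $f'(\lambda) < 0$ for $0 < \lambda < 1$ and $f(\lambda) \to \infty$ as $\lambda \to 0$ and $f(\lambda) \to -\infty$
as $\lambda \to 1$, so there is a unique $\lambda$ in $(0,1)$ satisfying this equation, and thus a
unique $c_1$ between $-d_1$ and $d_2$. Similar arguments show that there is also a
unique $c_1 < -d_1$ and a unique $c_1 > d_2$, so there are exactly 3 stationary orbits
on the line through $b_1$ and $b_2$, discovered by Euler in 1767.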
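
As a rough numerical illustration of the uniqueness argument, the following sketch locates the root of the reduced equation by bisection. It assumes the reduced equation has the form $f(\lambda) = m_2/\lambda^2 - m_1/(1-\lambda)^2 + m_1 - (m_1+m_2)\lambda = 0$ (obtained from (a) with $c_1 = d_2 - \lambda a$); the function names and the Sun-Earth mass ratio used in the usage note are illustrative assumptions, not values from the text.

```python
def f(lam, m1, m2):
    """Reduced collinear condition for 0 < lam < 1 (assumed form, see lead-in)."""
    return m2 / lam**2 - m1 / (1.0 - lam)**2 + m1 - (m1 + m2) * lam


def collinear_lambda(m1, m2, iterations=200):
    """Bisection for the unique root of f in (0, 1).

    f is strictly decreasing on (0, 1), tends to +infinity at 0 and to
    -infinity at 1, so a sign change brackets exactly one root.
    """
    lo, hi = 1e-12, 1.0 - 1e-12          # f(lo) > 0 > f(hi)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid, m1, m2) > 0:
            lo = mid                     # root lies to the right of mid
        else:
            hi = mid                     # root lies at or to the left of mid
    return 0.5 * (lo + hi)
```

For equal masses the root is $\lambda = 1/2$ by symmetry; for a hypothetical mass ratio of about $3 \times 10^{-6}$ (roughly Earth to Sun) the root is close to $0.01$, i.e., the stationary point between the bodies sits about one percent of the way from the lighter body toward the heavier one.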

Any other stationary orbits, “triangular” orbits, must satisfy
$$\text{(b}'\text{)}\qquad \omega^2 - G\left[\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right] = 0,$$
while (a) can be rewritten in the form
$$\text{(a}'\text{)}\qquad \left(\omega^2 - G\left[\frac{m_1}{r_1^3} + \frac{m_2}{r_2^3}\right]\right)c_1 - G\,\frac{m_1 m_2\,a}{m_1+m_2}\left[\frac{1}{r_1^3} - \frac{1}{r_2^3}\right] = 0,$$
and (b′) then implies that
$$r_1 = r_2.$$
Thus (b′) simply becomes
$$\omega^2 - \frac{G(m_1+m_2)}{r_1^3} = 0,$$
and using the relation $G/\omega^2 = a^3/(m_1+m_2)$, this shows that $r_1 = r_2 = a$. Thus
$b_1, b_2, c$ form an equilateral triangle. These triangular orbits were discovered
by Lagrange in 1772. When $m_1 \gg m_2$ and the orbit of $b_2$ is essentially a circle
around $b_1$, the orbit of $c$ is on that same circular orbit. It is customary to label
the one preceding $b_2$ in its orbit as L4 and the one following as L5, with Euler’s
collinear stationary orbits numbered L1, L2, and L3 (though there is no fixed
convention for the numbering of these three). L1, ..., L5 are called “libration”
points by astronomers, although all five are often called Lagrange points.
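
The equilateral-triangle conclusion is easy to check numerically. The sketch below evaluates the right side of (*) for a particle at rest in the rotating frame (so the Coriolis term drops out) and confirms that it vanishes at the two triangular points. The function names, the choice of units $G = a = 1$, and the sample masses are assumptions made for illustration.

```python
import math


def rotating_frame_acceleration(p, m1, m2, a=1.0, G=1.0):
    """Right side of (*) for a particle at rest at p = (x, y, z) in the rotating frame."""
    M = m1 + m2
    w2 = G * M / a**3                      # omega^2, from omega^2 a^3 = G(m1 + m2)
    d1, d2 = m2 * a / M, m1 * a / M        # b1 = (-d1, 0, 0), b2 = (d2, 0, 0)
    x, y, z = p
    r1 = math.sqrt((x + d1)**2 + y**2 + z**2)
    r2 = math.sqrt((x - d2)**2 + y**2 + z**2)
    ax = -G * (m1 * (x + d1) / r1**3 + m2 * (x - d2) / r2**3) + w2 * x
    ay = -G * (m1 * y / r1**3 + m2 * y / r2**3) + w2 * y
    az = -G * (m1 * z / r1**3 + m2 * z / r2**3)
    return ax, ay, az


def triangular_point(m1, m2, a=1.0, sign=+1):
    """Vertex of the equilateral triangle on b1, b2 (sign=+1 or -1 picks the side)."""
    d1 = m2 * a / (m1 + m2)
    return (a / 2 - d1, sign * math.sqrt(3) / 2 * a, 0.0)
```

For instance, with the hypothetical masses $m_1 = 3$, $m_2 = 1$, all three components returned by `rotating_frame_acceleration(triangular_point(3.0, 1.0), 3.0, 1.0)` are zero up to rounding error.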







Stability. Even allowing for the various idealizations in our analysis, like the 
assumption that c has negligible mass, the foregoing results require further con- 
sideration. Although we have found five mathematically stationary orbits, only 
stable ones are of interest; the tiniest asymmetry of the body at the orbit, or 
even a passing bit of cosmic dust, will change the orbit by a tiny amount, and 
we need to know that this will not cause the body to start wandering far away. 

To get some idea about the stability of the stationary orbits, consider the 
contour plot below for the potential function V associated with the vector field 





indicated by the first two terms on the right of (*); we have to omit the Coriolis 
force, which is a function of velocity, not position, but at least we know that 
this is 0 for stationary orbits. The potential $V$ approaches $-\infty$ around $b_1$
and $b_2$, while the Lagrange points, where the combined gravitational forces
of $b_1$ and $b_2$ just cancel the centrifugal force, are critical points for $V$. If a
body at a collinear Lagrange point is displaced horizontally, it will move into a
region where V has a smaller value, and thus continue to move away from its 
position, so we would expect these Lagrange points to be unstable. Of course, 
we've ignored the Coriolis force, which comes into play as soon as the particle 
near the Lagrange point starts to move, so this argument needs to be verified by 
calculations. As we will mention later, the calculations show that these points 
really are unstable, implying that the Coriolis force doesn’t materially change 
the situation. 

These calculations will also show that in the case of the triangular Lagrange
points L4 and L5, the Coriolis force does make a decisive difference: as $c$ moves
away from L4 or L5, “rolling down the potential hill”, the Coriolis force deflects
it, causing it to spiral, and when $m_1 \gg m_2$ the spiraling is sufficiently fast to
ensure that it stays in the vicinity of L4 or L5. Of course, one need not invoke
the Coriolis force for this explanation; it can be rephrased directly in terms of
the rotation of the Lagrange point around the center of mass of $b_1$ and $b_2$.
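
The phrase “rolling down the potential hill” can be made concrete with a small sampling experiment. The sketch below assumes the potential (per unit mass of the third body) is $V = -Gm_1/r_1 - Gm_2/r_2 - \tfrac12\omega^2(x^2+y^2)$, whose negative gradient gives the first two terms on the right of (*), and checks that $V$ is smaller at sample points on a small circle around L4 than at L4 itself. The names, the units $G = a = 1$, and the sample masses are illustrative assumptions; a finite sample of directions is of course only suggestive, not a proof.

```python
import math


def effective_potential(x, y, m1, m2, a=1.0, G=1.0):
    """Gravitational plus centrifugal potential in the orbital plane (assumed form)."""
    M = m1 + m2
    w2 = G * M / a**3
    d1, d2 = m2 * a / M, m1 * a / M
    r1 = math.hypot(x + d1, y)
    r2 = math.hypot(x - d2, y)
    return -G * m1 / r1 - G * m2 / r2 - 0.5 * w2 * (x * x + y * y)


def l4_position(m1, m2, a=1.0):
    """Planar coordinates of L4 in the rotating frame."""
    return (a / 2 - m2 * a / (m1 + m2), math.sqrt(3) / 2 * a)


def is_local_max_at_l4(m1, m2, eps=1e-3, directions=16):
    """Sample V on a circle of radius eps around L4; True if V drops in every sampled direction."""
    x0, y0 = l4_position(m1, m2)
    v0 = effective_potential(x0, y0, m1, m2)
    for k in range(directions):
        t = 2 * math.pi * k / directions
        v = effective_potential(x0 + eps * math.cos(t), y0 + eps * math.sin(t), m1, m2)
        if v >= v0:
            return False
    return True
```

With the hypothetical masses $m_1 = 0.9$, $m_2 = 0.1$ the check succeeds, consistent with L4 being a hilltop of $V$, so that any stability must come from the velocity-dependent Coriolis term.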




Though this may seem like a lot of analysis for a special instance of the three-
body problem that was concocted simply to produce a solvable problem, that all
changed in 1906, when the intrepid asteroid hunter Max Wolf found his 588th
asteroid, 588 Achilles, at L4 for the Sun-Jupiter system. The asteroid Patroclus
was then found at L5, and Hektor at L4. After that, the next asteroids Nestor
(L4), Priamus (L5), Agamemnon (L4), Odysseus (L4), Aeneas (L5), ..., were
always given the names of Greeks in the Trojan war if they were near L4 and
the names of Trojans if near L5, so that only Patroclus and Hektor are in the
wrong camps, and the combined groups were called the Trojan asteroids.

The web site www.kw.igs.net/~jackord/bp/f8.html shows plots of the
path of an asteroid starting very close to L4 for various ratios $m_1/m_2$. The
figure below shows the asteroid falling into the heavier body $b_1$ when the ratio





is 9, and into $b_2$ when the ratio is 19. By the time we get to a ratio of 99, which
includes the Sun-Jupiter system, we have stability, with the path staying very 
close to L4. 

Periodic orbits near the Lagrange points have also been studied (see page 404 
for a reference). In the Sun-Jupiter case, there are “stable orbits”, closed curves 
that rotate about the sun in sync with Jupiter, one of which is also shown in the 
above web site. In fact there are actually many asteroids of this sort near L4
and L5, which continue to be discovered at an alarming rate, with over 1,000 at
each point, and Greek and Trojan names have been exhausted long ago. The
rough picture below is based on a diagram that can be found in the web site
www.dtm.ciw.edu/sheppard/satellites/trojan.html, which has a contin-
ually updated count of the known Trojans.


[Figure: rough sketch of the Trojan asteroids clustered near L4 and L5 along the orbit of Jupiter.]







It also turns out that there are “Mars Trojans” at both the L4 and L5 points of
the Sun-Mars system, as well as Neptune Trojans. In addition, although there
are no known Earth Trojans, there are some very strange asteroid companions.
The web site math.ucr.edu/home/baez/lagrange.html, which covers many
topics about Trojans, gives references to other sites where these strange asteroid
companions of earth are discussed, together with animated pictures of their
orbits.


Stability calculations. In Addendum 6C we described stability criteria for first
order equations. A second order equation
$$(c_1,c_2,c_3)''(t) = F\bigl(c_1(t), c_2(t), c_3(t), c_1'(t), c_2'(t), c_3'(t)\bigr)$$
is handled in the usual way, by introducing new variables $v_1, v_2, v_3$ and writing
the equation as
$$(c_1, c_2, c_3, v_1, v_2, v_3)' = \bigl(v_1, v_2, v_3, F(c_1, c_2, c_3, v_1, v_2, v_3)\bigr).$$


Applying this to our second order equation
$$(*)\qquad (c_1''(t), c_2''(t), c_3''(t)) = -G\left[m_1\frac{c(t)-b_1}{r_1^3} + m_2\frac{c(t)-b_2}{r_2^3}\right] + \omega^2\cdot(c_1(t), c_2(t), 0) + 2\omega\cdot(c_2'(t), -c_1'(t), 0),$$
we will now be considering a $6 \times 6$ Jacobian matrix made up of a $3 \times 3$ matrix
with all entries 0, a $3 \times 3$ identity matrix, and two other matrices $J_1$ and $J_2$, the
second of which, coming from the term $2\omega\cdot(c_2'(t), -c_1'(t), 0)$, is simply
$$J_2 = 2\omega\begin{pmatrix} 0 & 1 & 0\\ -1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}.$$


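As for $J_1$, since
$$\frac{\partial r_1}{\partial x} = \frac{\partial\sqrt{(x+d_1)^2+y^2+z^2}}{\partial x} = \frac{x+d_1}{r_1},$$
the derivatives with respect to $x$ of the components of
$$\frac{(x,y,z)-b_1}{r_1^3} = \frac{(x+d_1,\,y,\,z)}{r_1^3}$$
are
$$\frac{1}{r_1^3} - \frac{3(x+d_1)^2}{r_1^5}, \qquad -\frac{3(x+d_1)y}{r_1^5}, \qquad -\frac{3(x+d_1)z}{r_1^5},$$
and similarly for the other partials. Remembering the $\omega^2\cdot(c_1(t), c_2(t), 0)$ term,
the complete expression for $J_1(x,y,z)$ will be
$$\omega^2\begin{pmatrix}1&0&0\\0&1&0\\0&0&0\end{pmatrix} - \frac{Gm_1}{r_1^3}\left[\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix} - \frac{3}{r_1^2}(x+d_1,y,z)^{t}\cdot(x+d_1,y,z)\right] - \frac{Gm_2}{r_2^3}\left[\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix} - \frac{3}{r_2^2}(x-d_2,y,z)^{t}\cdot(x-d_2,y,z)\right],$$
here written using terms like the product of $(x+d_1, y, z)$ with its transpose on
the left (also called the outer product of $(x+d_1, y, z)$ with itself): explicitly, the
whole first bracketed term is
$$\begin{pmatrix}
1 - \dfrac{3(x+d_1)^2}{r_1^2} & -\dfrac{3(x+d_1)y}{r_1^2} & -\dfrac{3(x+d_1)z}{r_1^2}\\[2ex]
-\dfrac{3(x+d_1)y}{r_1^2} & 1 - \dfrac{3y^2}{r_1^2} & -\dfrac{3yz}{r_1^2}\\[2ex]
-\dfrac{3(x+d_1)z}{r_1^2} & -\dfrac{3yz}{r_1^2} & 1 - \dfrac{3z^2}{r_1^2}
\end{pmatrix}$$
with a similar expression for the other term.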
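When our point $(x,y,z)$ is L4 or L5 we have
$$(x+d_1, y, z) = \frac{a}{2}\,(1, \pm\sqrt3, 0)$$
$$(x-d_2, y, z) = \frac{a}{2}\,(-1, \pm\sqrt3, 0),$$
where $+\sqrt3$ applies for L4 and $-\sqrt3$ for L5. Remembering that $r_1 = r_2 = a$
and that $\omega^2a^3 = G(m_1 + m_2)$, our first matrix comes down to
$$J_1(x,y,z) = \frac{3\omega^2}{4}\begin{pmatrix} 1 & \pm\sqrt3\,\mu & 0\\ \pm\sqrt3\,\mu & 3 & 0\\ 0 & 0 & -\tfrac43\end{pmatrix}, \qquad \mu = \frac{m_1 - m_2}{m_1 + m_2}.$$
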
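
As a sanity check on this closed form, one can compare it with a finite-difference Jacobian of the position-dependent part of (*). The sketch below is only an illustration under stated assumptions: units $G = a = 1$, central differences with a hypothetical step `h`, and helper names that do not come from the text.

```python
import math


def force(p, m1, m2, a=1.0, G=1.0):
    """Position-dependent part of (*): gravity of b1 and b2 plus the centrifugal term."""
    M = m1 + m2
    w2 = G * M / a**3
    d1, d2 = m2 * a / M, m1 * a / M
    x, y, z = p
    r1 = math.sqrt((x + d1)**2 + y**2 + z**2)
    r2 = math.sqrt((x - d2)**2 + y**2 + z**2)
    return (
        -G * (m1 * (x + d1) / r1**3 + m2 * (x - d2) / r2**3) + w2 * x,
        -G * (m1 * y / r1**3 + m2 * y / r2**3) + w2 * y,
        -G * (m1 * z / r1**3 + m2 * z / r2**3),
    )


def numerical_jacobian(p, m1, m2, h=1e-5):
    """Central-difference approximation of J1 at the point p."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = force(plus, m1, m2), force(minus, m1, m2)
        for i in range(3):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J


def closed_form_jacobian(m1, m2, sign=+1, a=1.0, G=1.0):
    """(3 w^2 / 4) [[1, s, 0], [s, 3, 0], [0, 0, -4/3]] with s = +-sqrt(3) mu."""
    M = m1 + m2
    w2 = G * M / a**3
    s = sign * math.sqrt(3) * (m1 - m2) / M
    k = 0.75 * w2
    return [[k, k * s, 0.0], [k * s, 3 * k, 0.0], [0.0, 0.0, -w2]]


def triangular_point(m1, m2, sign=+1, a=1.0):
    """L4 (sign=+1) or L5 (sign=-1) in the rotating frame."""
    return (a / 2 - m2 * a / (m1 + m2), sign * math.sqrt(3) / 2 * a, 0.0)
```

For the hypothetical masses $m_1 = 3$, $m_2 = 1$, the two matrices agree entry by entry to well within the truncation error of the difference scheme.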


Taking the corresponding approximating first order equations in 6 variables,
and putting them back into the form of a second order approximating equation
for $c = (c_1, c_2, c_3)^{t}$, we get
$$c'' - \frac{3\omega^2}{4}\begin{pmatrix} 1 & \pm\sqrt3\,\mu & 0\\ \pm\sqrt3\,\mu & 3 & 0\\ 0 & 0 & -\tfrac43\end{pmatrix} c + 2\omega\begin{pmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix} c' = 0,$$
which separates into a system for the first two components and the single equa-
tion
$$c_3'' + \omega^2 c_3 = 0.$$

The latter is simply the harmonic oscillator, with solutions that are periodic
functions, indicating that small changes in the $z$ direction, off of the plane con-
taining $b_1$ and $b_2$, will not lead to instability.

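Ignoring this third component, and considering $c = (c_1, c_2)^{t}$, we then have
the system
$$c'' + 2\omega\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix} c' - \frac{3\omega^2}{4}\begin{pmatrix} 1 & \pm\sqrt3\,\mu\\ \pm\sqrt3\,\mu & 3\end{pmatrix} c = 0,$$
for which we seek a solution of the form
$$c(t) = e^{\alpha t}\begin{pmatrix} a_1\\ a_2\end{pmatrix}.$$
This leads to the equations
$$\begin{pmatrix} \alpha^2 - \tfrac34\omega^2 & -\bigl(2\omega\alpha \pm \tfrac{3\sqrt3}{4}\omega^2\mu\bigr)\\[1ex] 2\omega\alpha \mp \tfrac{3\sqrt3}{4}\omega^2\mu & \alpha^2 - \tfrac94\omega^2\end{pmatrix}\begin{pmatrix} a_1\\ a_2\end{pmatrix} = 0,$$
which have non-trivial solutions only if the determinant of the matrix is zero.
Hence we need
$$\alpha^4 + \omega^2\alpha^2 + \tfrac{27}{16}\,\omega^4(1 - \mu^2) = 0,$$
$$\alpha^2 = -\frac{\omega^2}{2}\left(1 \pm \sqrt{1 - \tfrac{27}{4}(1-\mu^2)}\right).$$
Since any solution $\alpha$ will also give the solution $-\alpha$, if we have any solution
that is not purely imaginary, then we will definitely have one with a real positive
part, so our stationary orbit will definitely not be stable. So we obviously need
$$1 - \tfrac{27}{4}(1 - \mu^2) \ge 0$$
or
$$\mu^2 \ge \tfrac{23}{27}.$$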


Since $m_1 \ge m_2$ implies that $0 \le \mu < 1$, to have only purely imaginary solutions
we need
$$\mu \ge \sqrt{\tfrac{23}{27}} \implies \frac{m_1 - m_2}{m_1 + m_2} \ge .9229582.$$
This holds for the Sun and a planet. For example, the mass of the Sun is 1047.35
times the mass of Jupiter, so for the Sun and Jupiter we have
$$\mu = \frac{1047.35 - 1}{1047.35 + 1} = .9980922.$$
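
The criterion is simple enough to evaluate directly. The following sketch computes $\mu$, the two values of $\alpha^2$ from the quadratic above, and the mass ratio $m_1/m_2$ at which $\mu^2 = 23/27$; the function names are illustrative, and the test ratios 9, 19, and 99 are the ones quoted earlier for the web-site plots. As the text goes on to stress, passing this test shows only that the roots are purely imaginary, not that the orbit is actually stable.

```python
import cmath
import math


def mu(m1, m2):
    """mu = (m1 - m2) / (m1 + m2)."""
    return (m1 - m2) / (m1 + m2)


def alpha_squared(m1, m2, omega=1.0):
    """The two roots alpha^2 of alpha^4 + w^2 alpha^2 + (27/16) w^4 (1 - mu^2) = 0."""
    root = cmath.sqrt(1.0 - 27.0 / 4.0 * (1.0 - mu(m1, m2) ** 2))
    return (-(omega**2) / 2 * (1 + root), -(omega**2) / 2 * (1 - root))


def roots_purely_imaginary(m1, m2):
    """True when mu^2 >= 23/27, i.e. both alpha^2 are real and non-positive."""
    return mu(m1, m2) ** 2 >= 23.0 / 27.0


CRITICAL_MU = math.sqrt(23.0 / 27.0)
# Solving (m1 - m2) / (m1 + m2) = CRITICAL_MU for the ratio m1 / m2:
CRITICAL_RATIO = (1 + CRITICAL_MU) / (1 - CRITICAL_MU)
```

The critical ratio comes out just under 25, which is consistent with the plots described earlier: ratios of 9 and 19 fail the criterion, while 99 and the Sun-Jupiter value 1047.35 satisfy it.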


Of course, we have simply found the condition that the characteristic equa- 
tion of our matrix has purely imaginary roots, and this is the critical case in 
which we can’t determine stability for sure. Proving stability for this particu- 
lar case is an extremely difficult mathematical problem, which was only solved 
(nearly completely) in the 1960’s, though physicists presumably weren’t particu-
larly concerned with these mathematical niceties. For references, see Abraham
and Marsden [1; end of §10.2]; also see §10.3 of that book for a discussion of
closed orbits around the Lagrange points. 


The collinear Lagrange points. Although we will not carry out the calculations, 
it turns out that the collinear Lagrange points are all unstable (Abraham and
Marsden [1], Beutler [1], and Boccaletti and Pucacco [1]). 

In the case of the Sun-Earth system, although the collinear Lagrange points
are unstable, L1 and L2 are promising positions for artificial satellites, which
can be kept in stable orbit by rockets, which they have to have anyway. In fact,
SOHO, the Solar and Heliospheric Observatory, is at L1, with an unobstructed
view of the Sun, and WMAP, the Wilkinson Microwave Anisotropy Probe, is
at L2.

As for L3, always on the other side of the Sun from Earth, it has starred 
in various science fiction stories and movies as the hidden “Planet-X”, ideally
situated to attack Earth. With an instability on a time scale of 150 years, it isn’t 
a very good candidate for a planet, though it might make a good temporary 
launching field for an attack on Earth. 




ADDENDUM 10B 
THE SOUTHWARD DEFLECTION 


In the words of an accomplished experimental physicist, Hall [1], the question 
of a southward deflection of a falling body “has been answered in the negative, 
on theoretical grounds, by Gauss and by Laplace, and in the positive, on exper- 
imental grounds, by nearly every one of the investigators who have from time 
to time through more than two centuries made the actual trial.” 

In fact, the calculations by Laplace and Gauss were made after Benzenberg’s 
first experiments in 1802, dropping balls from the tower of St. Michael’s in 
Hamburg, and in these experiments Benzenberg had found both an eastward 
deflection and a southward deflection. For the later experiments of 1804, in a 
mine shaft in Schlebusch, Benzenberg dropped 40 balls, and then selected the 
drops that he considered to have been made under the most favorable condi- 
tions. The results then gave an eastward deflection close to the 8.8 mm predicted 
by Laplace’s and Gauss’s analyses, while the southward deflection now seemed 
to have become negligible. 

Hall concludes, “The honesty of Benzenberg is not to be questioned. But we
may well ask how far he may have been influenced, in ‘selecting’ his evidence 
from the whole body of data, by the knowledge that the authority of Gauss and 
of Laplace was dead against the southerly direction.” Indeed, Hall allows that 
such biases might be what a careful investigation of this strange discrepancy be-
tween theory and experiment would ultimately reveal: “If the whole mystery is
the consequence of mental bias in the experimenters, the proof and explanation 
of this bias would have, at least, the merit of psychological interest.” 

The second paper by Hall in the same journal is devoted to Hall’s own exper- 
iments. They too detected a southward deflection, but it was within the limits 
of experimental error for his particular setup, which had the advantage of being 
carried out in a specially constructed enclosed tower, but the disadvantage of 
involving a much shorter fall. 

Among the various possible explanations given for an actual southward de- 
flection, the influence of the bulging of the earth near the equator has seemed 
the most promising, but there are apparently still disputes about the matter. See 
French [1] for references. 

Hall’s paper mentions the many experimental difficulties involved in exper-
iments of this sort and considering how many there are, it’s something of a
wonder that reasonable results have ever been obtained. Aside from obvious
problems like air resistance, one of the most delicate points is assuring that the
objects are being dropped without having any sidewards motion imparted. Near
the end of his paper, Hall mentions one other factor—the reliability of plumb 
bobs—that I have not seen considered in any other investigations of this ques- 
tion. In particular, our calculation of the deflection of the hanging pendulum 
bob on page 380 essentially assumes that it hangs on a weightless filament, but 
in practice there will be considerable weight distributed along the supporting 
wire, so the vector g in equation (I’), which plays a crucial role in our analysis 
of the equations for a falling body, won’t really be indicated accurately by an 
actual plumb line. 

I do not know of any calculations trying to determine the difference that this 
makes, but it could conceivably explain the small southern deflections obtained, 
as arising from making measurements from an incorrect base point. 




PROBLEMS 


1.¹ Let $R$ be the radius of the earth (thought of as a perfect sphere) and $h$ the
height from which an object falls at the equator (at the end, all equations can be
multiplied by $\cos\lambda$ for the result at latitude $\lambda$). For convenience we also assume
the object has mass $m = 1$. Let $\omega$ be the angular velocity of the earth.

(a) At the time of release, the speed of the object is $v = \omega(R+h)$, and the
angular momentum has magnitude
$$v(R+h) = \omega(R+h)^2.$$
Letting $\omega(x)$ be the angular velocity of the object after it has fallen a vertical
distance $x = \frac12 gt^2$ [so that $\omega = \omega(0)$], show that
$$\omega(R+h)^2 = \omega(x)(R+h-x)^2,$$
so that
$$\omega(x) = \frac{\omega(R+h)^2}{(R+h-x)^2}.$$
(b) For $h \ll R$ we have approximately
$$\omega(x) = \omega(1 + 2x/R) = \omega(1 + gt^2/R).$$
(c) If the time of descent is $T$, then the total angular displacement is
$$\omega T + \frac{\omega g T^3}{3R},$$
and the displacement measured along the earth’s surface is $R$ times this.

(d) Noting that the foot of the tower has an angular displacement of $\omega T$, con-
clude that the displacement from the foot of the tower is
$$\tfrac13\,\omega g T^3.$$


2. (a) The easiest way to handle the equations on page 384 is simply to ignore
the quantities $x'$ and $y'$, since these are surely going to be very small, obtaining
$$x'' = -2\omega z'\cos\lambda$$
$$z'' = -g.$$
Using the initial conditions, obtain $z' = -gt$, substitute into the first equation,
and obtain immediately
$$x(t) = \tfrac13\,\omega g t^3\cos\lambda.$$
(b) Treat the cannon problem in the same way.


¹ From Reddingius [1].




3. Show that the volume of a segment of a paraboloid of revolution is half 
the volume of the circumscribing cylinder, of radius a, say. So if a liquid in a 





rotating cylinder rises to a height $h$ above its initial height, the depth below the
initial height is also $h$. Conclude that the angular velocity $\omega$ of rotation satisfies
$$\omega^2 = \frac{4gh}{a^2}.$$


If the wheels of a vehicle are attached by gears, at a known ratio, to an upright 
cylinder filled with liquid inside the vehicle, this allows us to use the cylinder 
as a speedometer. Such a mechanism, with oil as the liquid, was actually once 
used for trains. 
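
To get a feeling for the sizes involved in Problems 1 to 3, here is a small numerical sketch. It assumes the first-order eastward deflection $\tfrac13\omega g T^3\cos\lambda$ with $T = \sqrt{2h/g}$, and the relation $\omega^2 = 4gh/a^2$ for the rotating liquid; the numerical constants, function names, and sample inputs are assumptions chosen for illustration and do not come from the text.

```python
import math

OMEGA_EARTH = 7.292e-5   # rad/s, assumed rotation rate of the earth
G_ACCEL = 9.81           # m/s^2, assumed acceleration of gravity


def eastward_deflection(height, latitude_deg=0.0):
    """First-order eastward deflection (1/3) w g T^3 cos(latitude), in metres."""
    T = math.sqrt(2.0 * height / G_ACCEL)      # time of fall from rest
    return OMEGA_EARTH * G_ACCEL * T**3 * math.cos(math.radians(latitude_deg)) / 3.0


def spin_rate_from_rise(rise, radius):
    """Angular velocity of the liquid from w^2 = 4 g h / a^2 (rise h, cylinder radius a)."""
    return math.sqrt(4.0 * G_ACCEL * rise) / radius
```

For a hypothetical drop of 100 m at the equator the deflection is about 2.2 cm, and a rise of 5 cm in a cylinder of radius 10 cm corresponds to roughly 14 radians per second, which suggests why the liquid speedometer is practical while the deflection experiments are so delicate.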


CHAPTER 11 


FRICTION, 
FRIEND AND FOE 


Friction is generally regarded as an unwelcome intruder in theoretical mechanics,
where it is usually banished by fiat in elementary problems, or,
as in the discussion of rolling in Chapter 6, cleverly given a figurehead role 
without any actual influence. To be sure, the very first problem in this book 
acknowledged its importance, and a footnote in Chapter 9 implied that it might 
behoove us to consider it in more detail. 


The laws of friction. Tribology, the study of friction (from Greek τρίβω = rub),
nowadays a specialty, was originally just a side-line for researchers. The earliest
such investigations, aside from those of Leonardo da Vinci, unearthed only
later, were made by Guillaume Amontons (1663-1705), mainly noted for his
improvements to thermometers, barometers, and hygrometers. Amontons’ first
law of friction states that the frictional force of one object sliding on another is
proportional to the applied load. Though da Vinci measured the friction of a
block sliding on an inclined plane, the design of Amontons’ mechanism suggests 
that he did not. In any case, nowadays we are careful to say that friction is 
proportional to the normal (component of the) force, as mentioned in Chapter 6. 

The usual, somewhat mystifying, statement of Amontons’ second law, that 
the force depends only on the load, not on the amount of surface area where 





the objects are in contact, refers to a single body; for example, the block in the 
above figure will have the same friction with the surface on which it slides in 
all three positions. It appears that this conclusion, which had also been made
by da Vinci, was always considered very suspect, since friction seemed to be a
surface phenomenon. 

In this regard, mention is sometimes made of J. T. Desaguliers (1683-1744),
an experimenter in many fields, who in 1725 demonstrated a force of adhesion 
between two balls of lead (what we would now call molecular adhesion), and 
suggested, rather far-sightedly, that this might have something to do with fric- 
tion, though this proposal had to contend with the fact that the force of adhesion 
is not independent of the contact area, but proportional to it. 

The prolific Euler (1707-1783) performed his own analysis and experiments 
on friction, and noted that for a block of steel on an inclined steel plane, when 
the inclination reaches a critical angle the block starts moving, but only very 
slowly, while in the case of wood on wood the block starts moving relatively fast. 
This led him to distinguish between “kinetic friction”, the frictional force when
the object is moving, and “static friction”, the frictional force opposing motion 
when an applied force doesn’t cause motion, with kinetic friction less than or 
equal to static friction, and he is generally credited with being the first to make 
this distinction. 

Coulomb (1736-1806) is the person always sure to be mentioned in connec- 
tion with friction, though he is much more famous for his law of electrostatic 
forces. He became interested in friction through Amontons’ work, and care- 
fully verified that friction was independent of the contact area. Coulomb not 
only distinguished between static friction and kinetic friction, but also noted 
that static friction often increased when the block and the surface remained in 
contact for a long time, even providing an empirical formula for the rate of 
increase. He stated several rather detailed rules (cf. Meyer [1; pg. 9]) and nowa- 
days special attention is given to “sliding friction” satisfying Coulomb’s law that 
it is independent of speed, though Coulomb’s rule was not quite so definitive. 
Coulomb’s experiments were later extended in scope by Morin (1795-1880), 
who sometimes appears as the third musketeer of French friction researchers. 

Coulomb attributed friction to the roughness of the surfaces, and the effort 
needed to slide the protruding humps over each other, which would help explain

[Figure: two surfaces with a regular array of interlocking humps.]

Amontons’ second law, since both the surface area and the pressure would be
involved. Nowadays, this explanation is cavalierly dismissed on the grounds that 
the work expended moving up the humps is retrieved as the block slides back 
down under the normal force. Indeed, this criticism would seem reasonable for 
a regular array of humps as in the figure, but might not seem so applicable to a 
more random distribution of humps of various sizes. In any case, the explana-
tion seemed to conflict with some of Morin’s experiments, and was later seen to
conflict with other evidence. For example, it turned out that friction between
surfaces was often lower when one was significantly rougher than the other, not 
to mention the fact that highly polished surfaces might exhibit increased friction,
as mentioned on page 216, which fit in nicely with the ideas of Desaguliers. 

A resolution seemed imminent around 1950, when the contact area was ex- 
amined more carefully. Because of microscopic irregularities, the actual contact 
area of two surfaces is much smaller than the apparent macroscopic area, and, 
most importantly, an increase in the normal load pushes these irregularities 
closer together, so that they overlap more, and even flattens some, thus increas- 
ing the contact area. This lent support to the idea that friction does result pri- 
marily from molecular adhesion, and one can actually observe tiny fragments 
of the surfaces being worn away because of this adhesion force. 

The picture was not destined to remain so simple, however, since it was later 
shown that in some cases, as with the ultra-smooth surfaces of a cleaved piece 
of mica, there is friction even though there is no wear whatsoever, and the 
modern picture of friction invokes waves in the atomic lattice generated by the 
protrusions being deformed, described in some detail in Krim [1], and briefly
in Feynman [1; pg. 12-4]. 

This entire discussion involves “sliding friction”, and there are whole other 
areas that we haven’t even begun to consider, like rolling friction, which occurs 
when real, rather than idealized, bodies roll; viscous friction, which is not inde-
pendent of velocity, but proportional to it (cf. page 295); the use of lubricants;
etc., and we will studiously continue to ignore them. 


So where does this all leave us? Basically we will be considering only the sim- 
plified laws of Coulomb friction, sometimes called Amontons-Coulomb friction, 
and even sometimes Coulomb-Morin friction: 


Consider a fixed planar surface, and a body in contact with it, with the total 
force on this body decomposed as N + F, where N is normal to the surface, 
with magnitude N, and F parallel to it, with magnitude F. 


Then there is a critical value $\mu_s$, the static friction, and a number $\mu$ with
$0 \le \mu \le \mu_s$, the coefficient of (kinetic) friction, so that

for $F \le \mu_s\cdot N$, there is no motion,

for $F > \mu_s\cdot N$ there is motion in the direction of $F$, and the body acts
as if it is under the influence of a force of magnitude $F - \mu N$, i.e., there
is now an additional “frictional force” of magnitude $\mu N$.


Note that the stipulation $\mu \ge 0$ implies that the frictional force is always in the
opposite direction from $F$. And since $\mu$ is simply a number, it is implied that
this frictional force is independent of the speed of motion.
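
The simplified law can be written as a two-case function. The sketch below is only a restatement of the rule above in code, under the assumption that `F` and `N` are the non-negative magnitudes of the parallel and normal components; the function name and the sample coefficients are hypothetical.

```python
def coulomb_net_force(F, N, mu_s, mu):
    """Magnitude of the effective force along the surface under the simplified law.

    F     -- magnitude of the force component parallel to the surface
    N     -- magnitude of the normal component
    mu_s  -- static friction threshold coefficient
    mu    -- coefficient of (kinetic) friction, with 0 <= mu <= mu_s
    """
    if not 0 <= mu <= mu_s:
        raise ValueError("the law requires 0 <= mu <= mu_s")
    if F <= mu_s * N:
        return 0.0            # no motion: the applied force is fully opposed
    return F - mu * N         # motion: a frictional force of magnitude mu*N opposes F
```

Note the jump at the threshold when $\mu < \mu_s$: just below $F = \mu_s N$ the effective force is 0, just above it is $(\mu_s - \mu)N > 0$, which is one way to see why a block that finally breaks loose starts with a lurch.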




Of course, this is a purely empirical law, and like almost every empirical law, it 
is hedged in on all sides with restrictions and exceptions. Nevertheless, the very 
basic idea, that friction is proportional to the normal load, or sometimes merely
the fact that friction exists, and, importantly, is always in the opposite direction 
of motion, helps explain, or is even crucial to explain, certain phenomena. 
Before discussing some complex examples (a couple of simple examples are 
given in the Problems), we ought to examine the problems that the Amontons- 
Coulomb laws of friction present on purely logical grounds. 


The Painlevé paradoxes. In machinery with parts moving against each other,
it often happens, even when the designers have taken friction into account to 
estimate the amount of force that needs to be applied, that the machinery may 
stop sliding smoothly, moving in starts and stops, or vibrating, or even locking 
up in certain positions. In a more mundane example, those of us of a certain 
age, who once wrote with chalk on blackboards, know that when a hard piece 
of chalk is pushed on the blackboard at an angle nearly perpendicular to it, the
chalk will sometimes start skipping erratically instead of sliding, resulting in a 
dashed or dotted line. One might imagine various reasons for these effects, but 
the work of Painlevé [2], best known for his mathematical work, first brought 
attention to the fact that they are to be expected because of some inconsisten- 
cies between the basic laws of mechanics and the laws of Amontons-Coulomb 
friction.¹

Modern presentations of the Painlevé paradoxes usually begin with a simple 
example called the Painlevé-Klein problem. Consider two carts moving along 





parallel guides, connected by a rigid rod, with $\phi$ the angle determined by the
length of the rod and the distance between the guides. Cart A is dragged along
at a constant distance behind cart B, which is supposed to slide frictionlessly, as
indicated by the little wheels or ball bearings, while cart A slides with coefficient
of friction $\mu$. For the simplest case we assume that the two carts each have mass
$m = 1$, while the connecting rod has negligible mass, and that a force in the
direction of the guides, with magnitude $F$, is applied at B. The normal force
of magnitude $N$ on A now arises from the tension force of magnitude $T$ along
the rod, with
$$N = T\sin\phi.$$

¹ This work was somewhat anticipated by Jellett [1], who will reappear in a later section.


Note that $N$ and $T$ may be positive or negative. As usual, we let $\operatorname{sign} a$ be $+1$
for $a > 0$, and $-1$ for $a < 0$, and $0$ for $a = 0$.

The total horizontal force acting on B is $F - T\cos\phi$, so if $x$ is the coordinate
of B we have
$$\text{(a)}\qquad x'' = F - T\cos\phi = F - \frac{N}{\tan\phi}.$$
On the other hand, the whole system, consisting of the two connected carts,
satisfies
$$\text{(b)}\qquad 2x'' = F - \mu|N|\operatorname{sign}x' = F - \mu N\operatorname{sign}N\operatorname{sign}x',$$
and equations (a) and (b) yield
$$N = \frac{F\tan\phi}{2 - \mu\tan\phi\,\operatorname{sign}N\operatorname{sign}x'}.$$
Now suppose that $\mu\tan\phi > 2$.


If we seek a solution with x’ > 0, we immediately obtain a contradiction 
for either value of sign N. 


For $x' < 0$ the solution is not unique, for we have both
$$N = \frac{F\tan\phi}{2 + \mu\tan\phi} \qquad\text{and}\qquad N = \frac{F\tan\phi}{2 - \mu\tan\phi}.$$


If instead we have $\mu\tan\phi < 2$, then we obtain the unique solution
$$N = \frac{F\tan\phi}{2 - \mu\tan\phi\,\operatorname{sign}x'}, \qquad x'' = \frac{F(1 - \mu\tan\phi\,\operatorname{sign}x')}{2 - \mu\tan\phi\,\operatorname{sign}x'}.$$
As long as $F$ is bounded, $N$ will stay bounded also, so if the system at least starts
in a reasonable state it can never reach a singular position where $\mu\tan\phi = 2$.
Nevertheless, not everything is hunky-dory. The most interesting property of
the system involves its motion starting at rest, $x' = 0$. The frictional force on A
has magnitude
$$\mu N\operatorname{sign}x' = \frac{\mu F\tan\phi\,\operatorname{sign}x'}{2 - \mu\tan\phi\,\operatorname{sign}x'},$$
while the horizontal force on A due to the force applied at B is
$$T\cos\phi = \frac{N}{\tan\phi} = \frac{F}{2 - \mu\tan\phi\,\operatorname{sign}x'}.$$
In order for A to start moving, the second must be greater than the first,
$$\frac{F}{2 - \mu\tan\phi\,\operatorname{sign}x'} > \frac{\mu F\tan\phi\,\operatorname{sign}x'}{2 - \mu\tan\phi\,\operatorname{sign}x'}.$$
When $F > 0$, so that B is being pushed to the right, this implies
$$\mu\tan\phi\,\operatorname{sign}x' < 1.$$
In order for B to move right, with $\operatorname{sign}x' = 1$, we thus must have
$$\mu\tan\phi < 1.$$


Hence, if there is any friction at all, w > 0, then for a large enough initial angle @ 
it will be impossible to start, and the system is “wedged”, or self-braking. 
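
The case analysis for the two carts is easy to check by brute force. The following Python sketch is only an illustration: the function names are mine, and it assumes F > 0 and tan φ > 0. It tries both values of sign N in the formula for N and keeps a value only when its sign agrees with the one that was assumed.

```python
def normal_force_solutions(F, mu, tan_phi, sign_xdot):
    """Return every N satisfying
        N = F*tan_phi / (2 - mu*tan_phi*sign(N)*sign_xdot).
    Hypothetical helper: it simply tries sign N = +1 and sign N = -1."""
    solutions = []
    for sign_n in (+1, -1):
        denom = 2 - mu * tan_phi * sign_n * sign_xdot
        if denom == 0:
            continue  # singular position, mu*tan(phi) = 2
        n = F * tan_phi / denom
        if n * sign_n > 0:  # keep N only if its sign matches the assumed one
            solutions.append(n)
    return solutions


def can_start_to_the_right(mu, tan_phi):
    """Starting from rest with F > 0 requires mu*tan(phi) < 1."""
    return mu * tan_phi < 1


# mu*tan(phi) = 3 > 2: no solution with x' > 0, two solutions with x' < 0.
print(normal_force_solutions(1.0, 1.0, 3.0, +1))
print(normal_force_solutions(1.0, 1.0, 3.0, -1))
# mu*tan(phi) = 1 < 2: a unique solution.
print(normal_force_solutions(1.0, 0.5, 2.0, +1))
```

Enumerating the two signs mirrors the argument in the text, which also treats sign N as an unknown to be checked for consistency after the fact.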


The standard Painlevé example, which can serve as a model of the chalk
phenomenon, involves a uniform rod of mass 1 and length 2l, which is sliding
on a horizontal surface, with a coefficient of friction μ. We will let x(t) be the
x-coordinate at time t of the bottom end of the rod, φ(t) the angle that the rod
makes with the horizontal, and h(t) the height of the center of the rod. Finally,
let y(t) be the y-coordinate of the bottom end of the rod (since we do not rule
out the possibility that the rod will eventually rise above the surface), with

$$\text{(a)}\qquad h = y + l\sin\phi.$$


The magnitude N(t) of the normal force of the rod on the surface at time t
is also the magnitude of the force exerted by the surface upward on the rod at
time t, so we have

$$\text{(b)}\qquad x'' = -\mu N$$
$$\text{(c)}\qquad h'' = -g + N.$$
Finally, the moment of inertia of the rod about the axis passing through the
center and perpendicular to the plane of the figure is (1/3)l² (cf. Problem 5-6). So
equation (b) gives

$$\tfrac{1}{3}\,l^2\phi'' = l(\mu\sin\phi - \cos\phi)N,$$

or

$$\text{(d)}\qquad \phi'' = 3N(\mu\sin\phi - \cos\phi)/l.$$

Now (a) gives

$$h'' = y'' + l(\phi''\cos\phi - \phi'^2\sin\phi),$$


and when we substitute in from (c) and (d) we get

$$\begin{aligned} y'' &= [1 + 3\cos\phi(\cos\phi - \mu\sin\phi)]\cdot N + [l\phi'^2\sin\phi - g] \\ &= A\cdot N + b, \text{ say.} \end{aligned}$$

Since the force N arises from the contact of the surface with the rod, which
cannot go below the surface, we have two additional conditions: y''(t) > 0 ⟹
N(t) = 0 and N(t) > 0 ⟹ y(t) = 0.
When A > 0, it is easy to find the solution for b ≠ 0:
    For b > 0, we have y'' > 0, so N = 0 and thus y'' = b.
    For b < 0, we can't have y'' > 0, for then we would have N = 0, and
    thus y'' = b < 0. So y'' = 0, and thus N = −b/A.
When A < 0, on the other hand, things are quite different:
    If b > 0, the solution is not unique; in fact, we have the two solutions
    given previously,
        y'' = b and N = 0,
        y'' = 0 and N = −b/A.
    If b < 0 then there is no solution, since we would have y'' < 0.


But the condition A < 0 can certainly occur, simply by making μ large
enough. More specifically, we have A = 0 for

$$\mu = \frac{1 + 3\cos^2\phi}{3\sin\phi\cos\phi} = \frac{1}{3}(\sin\phi\cos\phi)^{-1} + \cot\phi,$$

and the minimum value of this for all φ is for φ = arctan 2, with μ having the
value¹ μ = 4/3. Moreover, given μ ≥ 4/3, we have A = 0 for

$$3\mu\sin\phi\cos\phi = 1 + 3\cos^2\phi,$$

or

$$3\mu\tan\phi = \frac{1}{\cos^2\phi} + 3 \quad\Longrightarrow\quad \tan^2\phi - 3\mu\tan\phi + 4 = 0,$$

so we have A < 0 for any φ between

$$\arctan\left(\frac{3\mu - \sqrt{9\mu^2 - 16}}{2}\right) \qquad\text{and}\qquad \arctan\left(\frac{3\mu + \sqrt{9\mu^2 - 16}}{2}\right).$$


This is summed up in the following figure, based on Génot and Brogliato [1],





where a detailed analysis is given to show that there are solutions that reach 
the two singular points indicated by dots, at which existence and uniqueness of 
solutions no longer holds. At these two points we have A = 0 and b = 0, which
would lead us to expect that y'' = 0. But in fact there are solutions where
we have y'' ≠ 0, which turns out to be possible because as we approach these
points, the normal load N becomes infinite.
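
For readers who like to see such case distinctions mechanically, here is a small Python sketch. The function names are my own, and I am assuming that the rod is momentarily in contact with the surface with y = y' = 0, so that the admissible pairs are those with y'' ≥ 0, N ≥ 0 and y''·N = 0. It lists the solutions of y'' = A·N + b and computes the interval of angles on which A < 0.

```python
import math


def A(phi, mu):
    """Coefficient of N in y'' = A*N + b."""
    return 1 + 3 * math.cos(phi) * (math.cos(phi) - mu * math.sin(phi))


def contact_solutions(a_coeff, b):
    """All pairs (y'', N) with y'' = a_coeff*N + b, y'' >= 0, N >= 0, y''*N = 0."""
    solutions = []
    if b >= 0:
        solutions.append((b, 0.0))             # no contact force: N = 0, y'' = b
    if a_coeff != 0 and -b / a_coeff > 0:
        solutions.append((0.0, -b / a_coeff))  # rod stays on the surface: y'' = 0
    return solutions


def paradox_interval(mu):
    """Angles (in radians) between which A < 0, or None if there are none."""
    disc = 9 * mu * mu - 16
    if disc <= 0:
        return None
    root = math.sqrt(disc)
    return math.atan((3 * mu - root) / 2), math.atan((3 * mu + root) / 2)


print(contact_solutions(1.0, -2.0))   # A > 0, b < 0: unique solution
print(contact_solutions(-1.0, 2.0))   # A < 0, b > 0: two solutions
print(contact_solutions(-1.0, -2.0))  # A < 0, b < 0: no solution
print(paradox_interval(2.0))
```

The endpoints returned by `paradox_interval` are the two roots of tan²φ − 3μ tan φ + 4 = 0, so A should vanish there up to rounding.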


¹ This is quite large, coefficients of friction usually being considerably less than 1, but
more realistic models of a piece of chalk, as a thin cylinder with rounded edges, give
more realistic values for μ.




Of course, none of these paradoxical phenomena pose any threat to our ba- 
sic understanding of theoretical mechanics, depending as they do on empirical 
laws that we know to be only approximate, treating situations that would be- 
come hopelessly complicated to describe accurately in detail. But they present 
great challenges for the actual design of machinery, and form an important 
area of research in applied, or engineering, mechanics, where the goal is to obtain
reasonable models that can handle these situations. These models, which
may involve considerably different approaches and assumptions, often depend 
on rather complex and sophisticated mathematics. Overviews, and extensive 
bibliographies, may be found in Brogliato [1] and Anh [1]. 

Extracts from Painlevé’s work, as well as discussion and references to the criti- 
cisms that followed—basically early attempts to create models for the problem— 
can also be found in the imposing work Hamel [1; pp. 543 ff, 629 ff].


The noble game of billiards. In Chapter 6 we considered the theoretical case of a
perfect sphere rolling on a flat rigid surface, with friction playing the irrelevant,
yet essential, role of ensuring rolling. Of course, that picture would not be
entirely correct even for a perfectly spherical billiard ball, touching the table at
a theoretical single point, because the weight of the ball causes the cloth of the
table to compress slightly, and this produces friction, which would eventually
cause a rolling ball to stop. However, this effect is very small, and we will
ignore it.

Friction does play an extremely important role in the game of billiards, but 
for quite a different reason: a billiard ball is often not rolling, but instead moving 
with a combination of spinning and sliding, with the speed of the center of the 
ball and the rate of rotation not being in the relation required for rolling, even 
if the ball does happen to be rotating about an axis. 

The figure below shows the simplest situation where at time t = 0 a billiard
ball of radius R has been hit straight on by the cue, with the contact point at 





height h above the billiard table. For convenience we set the mass of the ball to 
be m = 1, and choose the origin O of our coordinate system to be the center 
point of the ball at ¢ = 0, with the x-axis in the direction of the cue, the y-axis 
also parallel to the table (perpendicular to the plane of the paper in this figure), 
and the z-axis vertical, with the positive part pointing downward, so that it 
passes through the point of contact. The strike of the cue, which is usually quite




abrupt, can be thought of as imparting an impulsive force P that causes the
center of the ball to move along the x-axis with velocity function v satisfying

$$v(0) = |\mathbf{v}(0)| = |\mathbf{P}|,$$


and, as in Chapter 6, we let v_C be the velocity function of the contact point.
At t = 0 the torque of the force P about the center of the ball is

$$\boldsymbol{\tau}(0) = (h - R)\,v(0)\cdot(\text{unit vector along the } y\text{-axis}),$$

so for the angle θ through which the ball has rotated around the y-axis, measured
clockwise, we have

$$I\theta'(0) = (h - R)\,v(0),$$

where the moment of inertia I of the ball is (2/5)R² (Problem 5-6). It follows that

$$\theta'(0) = \frac{5}{2}\left(\frac{h - R}{R^2}\right)v(0),$$

which implies that at t = 0 the vector v_C has magnitude

$$v_C(0) = v(0) - R\theta'(0) = \left(\frac{7R - 5h}{2R}\right)v(0).$$


For the ball to start out with a rolling motion we must have (compare page 240),
v_C(0) = 0, which means that the ball must be hit at exactly the height h = (7/5)R,
which just so happens to be the height of the cushions, presumably found from
experience by the makers of billiard tables.
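
The dependence of the initial slip on the height of the hit can be tabulated with a few lines of Python. This is only a sketch of the two formulas above, with function names of my own choosing, and with the mass taken to be 1 as in the text.

```python
def initial_spin_rate(h, R, v0):
    """theta'(0) = (5/2) * (h - R) / R**2 * v(0), for a ball of mass 1."""
    return 2.5 * (h - R) / R**2 * v0


def contact_point_velocity(h, R, v0):
    """v_C(0) = v(0) - R*theta'(0), which equals (7R - 5h)/(2R) * v(0)."""
    return v0 - R * initial_spin_rate(h, R, v0)


R, v0 = 1.0, 3.0
for h in (1.0, 1.2, 1.4, 1.6):
    # h = 1.4 R is the rolling height; below it v_C > 0, above it v_C < 0
    print(h, contact_point_velocity(h, R, v0))
```

A center hit (h = R) gives no spin at all, so the contact point simply slides forward with the full speed v(0).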

For a “high shot”, with h > (7/5)R, the direction of v_C will, as in (a), be opposite
to that of v: the spinning of the ball, θ', is larger than would be expected for
rolling, given the speed v of the center. But now the non-zero velocity v_C gives
rise to a force of friction F in the opposite direction (b), and this force acts on the
rigid ball as a whole: it causes v to increase, and at the same time it causes θ'
to decrease until rolling begins, and the ball then simply continues to roll.

[Figure: panels (a) and (b), showing the velocity v_C of the contact point and the friction force F.]




For a “low shot”, with h < (7/5)R, the situation is exactly the opposite: the
direction of v_C will be the same as that of v, with θ' less than expected, possibly
even negative (with the ball spinning backwards). So the force of friction will
be in the opposite direction to v, causing v to decrease, and θ' to increase until
rolling begins.

If a player wants the ball to roll, it is only necessary to strike it with h close
to (7/5)R, and rolling will soon ensue. But sometimes high or low shots are
specifically used to control the behavior of the cue ball after it collides with an object
ball at rest. In the case of a high shot (a), since the balls have the same mass,
and the collision is almost perfectly elastic, at the moment of impact (b) the
velocities are interchanged, so that the cue ball now has velocity 0, and the
object ball acquires the velocity v in the same direction. If the cue ball hasn't
reached the rolling stage, so that v_C ≠ 0, then (c) the friction force F causes the
cue ball to move in the same direction, so that we have a follow shot.

[Figure: panels (a), (b), (c) of the collision.]

In the case of a low shot, the situation is again exactly the opposite: now v_C
points in the same direction as v, so F points in the opposite direction, causing 
the cue ball to move backward, a draw shot. 

This analysis of the simple case of billiard balls rolling on a straight line can 
only whet the appetite of mathematically inclined billiard players, for it hardly 
touches on the intricacies encountered in the game. In fact, the 19th century saw
the appearance of a thorough analysis of the dynamics of billiards, including 
the impressive shot shown below. As with all the studies previously mentioned, 
it was written by some one who is nowadays much better known for other work. 


[Figure: Point scored in a game of three-cushion billiards. Cueball drove first
object ball along dotted path and followed solid path, contacting the long rail at
point 1, curving back to the same rail at point 2, and continuing to point 3 on
the end rail to score on the second object ball.]







Biographical accounts of Coriolis (1792-1843) [basically from Dictionary of
Scientific Biography] note that he was a student at the École Polytechnique and
worked for several years in the engineering corps, until his poor health led him
in 1816 to accept a position as tutor at the École Polytechnique; his life was
from then on dedicated to the teaching of science, and he eventually became
a greatly admired Director of Studies. He felt that the results of mécanique
rationnelle (“rational mechanics”) should be used to give general principles
applicable to the operation of machinery, and in his first book, Du calcul de l'effet
des machines (“On the Calculation of Mechanical Action”), he introduced the
modern meaning of the term “work” in physics, as well as the proper factor 1/2
into the definition of kinetic energy for conservation of energy to hold. (His
investigations of 1835 into the “Coriolis force” were made to account for
conservation of energy, and thus followed a much more complicated path than our
modern purely kinematic approach.)

This all sounds rather dutiful, if not a bit dreary, but Coriolis presumably had
his diversions and times of relaxation: in the year 1835 he also published his
second book, Théorie mathématique des effets du jeu de billard (“Mathematical
Theory of the Game of Billiards”) [though in his preface he dutifully says “I
think that persons acquainted with rational mechanics, such as the students at
the École Polytechnique, will be interested in the explanation of the surprising
effects observed in the motion of billiard balls.”]. By the way, in case you were
wondering, the Coriolis force is not relevant to the game of billiards—unless, of 
course, you are playing on a rotating billiard table, an extra delight that Gilbert 
and Sullivan’s Mikado forgot to include when making the punishment fit the 
crime of being a billiard shark. 

Until its reprinting as Coriolis [1], this book was extremely hard to find, even 
if one was up to reading the French. So I was delighted to learn from a friend 
of old that he had made an English translation, Coriolis [2], which presents 
Coriolis’ theory, together with accounts of the careful experiments he made to 
determine coefficients of friction. 

Though Coriolis’ exposition is quite straightforward, modern terminology 
helps the exposition. The material presented here, covering just a bit of Coriolis'
work, has appeared in various classical works on mechanics, and can be found
in notes available on the web.¹ Having these notes at hand may help smooth
the study of the rest of Coriolis’ book—for those readers in pursuit of the lost 
time of their misspent youth. 


¹ They may be found at the web site http://billiards.colostate.edu; click on
“Technical Proof (TP) analyses”, and scroll down to the TP A section; here we are
covering material from TP A.4.




We first study the path of the cue ball when it is not necessarily hit head on, but 
possibly to the left or right of center (with side english). Coriolis considers both
the sliding friction and the very small rolling friction for the initial part of his 
analysis, and then ignores the rolling friction later on, but we will simplify things 
by ignoring it right from the start, which doesn’t actually affect the outcome. 

We will use the same coordinate system as before, letting e₁ be a unit vector
pointing along the x-axis, with e₂ pointing along the y-axis, and e₃ pointing
(downward) along the z-axis. We now use the more general equation on
page 190,

$$\text{(1)}\qquad \boldsymbol{\tau} = I\boldsymbol{\omega}' = \tfrac{2}{5}R^2\boldsymbol{\omega}',$$

with the velocity v_C of the contact point given by

$$\text{(2)}\qquad \mathbf{v}_C = \mathbf{v} + \boldsymbol{\omega}\times R\,\mathbf{e}_3.$$


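It will be convenient to let u(t) denote the unit vector in the direction of v_C(t),
and write the frictional force at the contact point as

$$\mathbf{F} = -\mu\,\mathbf{u}$$

(where μ really denotes the product of the coefficient of friction, the mass m,
which we've taken to be 1, and the force of gravity g, to get the weight). This
means that the acceleration v' of the ball is

$$\text{(3)}\qquad \mathbf{v}' = -\mu\,\mathbf{u}.$$

The torque τ of the force F about the center of the ball is

$$\boldsymbol{\tau} = R\,\mathbf{e}_3\times\mathbf{F} = -\mu R\,(\mathbf{e}_3\times\mathbf{u}),$$

so equation (1) gives

$$\text{(4)}\qquad \boldsymbol{\omega}' = -\frac{5\mu}{2R}\,\mathbf{e}_3\times\mathbf{u}.$$

Differentiating (2), and substituting from (3) and (4), we have

$$\mathbf{v}_C' = -\mu\,\mathbf{u} - \frac{5\mu}{2R}\,(\mathbf{e}_3\times\mathbf{u})\times R\,\mathbf{e}_3$$

or simply

$$\text{(5)}\qquad \mathbf{v}_C' = -\frac{7}{2}\,\mu\,\mathbf{u}.$$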


Notice that v_C' is always parallel to v_C (though oppositely directed), since u is,
by definition. But this implies that v_C does not change direction, and thus that
u is a constant vector.




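So from (3) we can write

$$\text{(6)}\qquad \mathbf{v}(t) = \mathbf{v}(0) - \mu t\,\mathbf{u},$$

and thus the (center of the) ball follows a path c with

$$c(t) = c(0) + t\,\mathbf{v}(0) - \tfrac{1}{2}\mu t^2\,\mathbf{u},$$

which is a parabola. From (2) we have the explicit formula

$$\text{(7)}\qquad \mathbf{u} = \frac{\mathbf{v}(0) + \boldsymbol{\omega}(0)\times R\,\mathbf{e}_3}{v_C(0)}.$$

This all holds, of course, only while the ball is not rolling; as we found in
Chapter 6, as soon as it starts rolling, it will continue on a straight path, along
the tangent line to the parabola. Since (5) gives

$$\mathbf{v}_C(t) = \mathbf{v}_C(0) - \frac{7\mu t}{2}\,\mathbf{u},$$

and rolling starts at the time t* where we first have v_C(t*) = 0, we see that

$$t_* = \frac{2v_C(0)}{7\mu}.$$

So the velocity v(t*) when the ball starts rolling is, from (6) and (7),

$$\mathbf{v}(t_*) = \mathbf{v}(0) - \mu\left(\frac{2v_C(0)}{7\mu}\right)\mathbf{u} = \mathbf{v}(0) - \frac{2v_C(0)}{7}\left[\frac{\mathbf{v}(0) + \boldsymbol{\omega}(0)\times R\,\mathbf{e}_3}{v_C(0)}\right]$$

or

$$\text{(8)}\qquad \mathbf{v}(t_*) = \frac{5}{7}\,\mathbf{v}(0) - \frac{2}{7}\,\boldsymbol{\omega}(0)\times R\,\mathbf{e}_3.$$

We can write this result explicitly as

$$\mathbf{v}(t_*) = \frac{1}{7}\Bigl[\bigl(5v_1(0) + 2R\omega_2(0)\bigr)\mathbf{e}_1 + \bigl(5v_2(0) - 2R\omega_1(0)\bigr)\mathbf{e}_2\Bigr].$$

For the final deflected angle θ, which it will be convenient to measure from the
y-axis rather than from the x-axis, we then have

$$\text{(9)}\qquad \theta = \arctan\left(\frac{5v_1(0) + 2R\omega_2(0)}{5v_2(0) - 2R\omega_1(0)}\right).$$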


Note that the quantity μ has disappeared from these equations, so the coefficient
of friction of the ball on the cloth has no effect on this final result, even though
the path itself would change.
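
The disappearance of μ can also be watched numerically. The Python sketch below is an illustration under assumptions of my own: instead of ω itself it tracks the horizontal vector s = ω × Re₃ directly (which sidesteps any sign conventions for the cross product), using v' = −μu from (3) and s' = −(5/2)μu from (4). It integrates the sliding phase until v_C = v + s vanishes, and the result can then be compared with (8).

```python
import math


def rolling_velocity(v0, s0, mu, dt=1e-4):
    """Integrate the sliding phase until the contact point stops slipping.

    v0 -- initial velocity (v1, v2) of the center of the ball
    s0 -- the two horizontal components of omega(0) x R e3 (given directly;
          this parametrization is an assumption of the sketch)
    mu -- the constant in F = -mu*u
    """
    vx, vy = v0
    sx, sy = s0
    while True:
        cx, cy = vx + sx, vy + sy        # contact-point velocity v_C, as in (2)
        speed = math.hypot(cx, cy)
        if speed < 1e-12:                # v_C = 0: rolling has begun
            return vx, vy
        # |v_C| shrinks at the rate 7*mu/2, so never step past the rolling time
        step = min(dt, speed / (3.5 * mu))
        ux, uy = cx / speed, cy / speed  # the unit vector u
        vx -= mu * step * ux             # v' = -mu*u
        vy -= mu * step * uy
        sx -= 2.5 * mu * step * ux       # (omega x R e3)' = -(5/2)*mu*u
        sy -= 2.5 * mu * step * uy


v0, s0 = (1.0, 0.2), (-0.5, 0.8)
predicted = (5 / 7 * v0[0] - 2 / 7 * s0[0], 5 / 7 * v0[1] - 2 / 7 * s0[1])  # from (8)
for mu in (0.5, 2.0):
    print(mu, rolling_velocity(v0, s0, mu), predicted)
```

Changing μ changes how long the sliding phase lasts, and hence the parabola, but not the velocity with which the ball finally rolls away.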




Now suppose instead that the cue ball is hit straight on, without side english,
along the y-axis to collide (a) at an angle with a stationary object ball. The
collision will give the cue ball a spin and we want to consider the path of the
cue ball after collision (b), and the deflected angle θ. Problem 6-4 shows that
the velocity of the cue ball after the (perfectly elastic) collision (c) will be
perpendicular to the “impact line” (the line perpendicular to the two balls at the
point of contact), and that if the velocity of the cue ball is v at impact, then the
velocity after impact will have magnitude v sin φ, where φ is the “cut angle”
between the original direction of the cue ball and the impact line. From the
diagram, we see that this initial velocity, after the impact, is

$$\mathbf{v}(0) = (v\cos\phi\sin\phi,\; v\sin^2\phi,\; 0).$$

[Figure: panels (a), (b), (c), showing the impact line and the cut angle.]





We are assuming that at impact, which we will consider as time t = 0, the
cue ball has ω₂(0) = 0. Setting ω = ω₁(0), equation (9) becomes

$$\theta = \arctan\left(\frac{5v_1(0)}{5v_2(0) - 2R\omega}\right) = \arctan\left(\frac{5v\sin\phi\cos\phi}{5v\sin^2\phi - 2R\omega}\right).$$

In particular, suppose that the cue ball is rolling at the time of impact. Then
ω = −v/R, so we get

$$\theta = \arctan\left(\frac{\sin\phi\cos\phi}{\sin^2\phi + \frac{2}{5}}\right).$$

[Figure: graph of θ (vertical axis, marked at 10°, 20°, 30°, 40°) as a function of φ (horizontal axis, marked at 22.5°, 45°, 67.5°, 90°).]




The graph of θ as a function of φ shows that for collisions that are neither
too close to head on, nor too glancing, θ is about 30°, giving the “30 degree
law” for estimating the direction in which the cue ball will bounce. You can
find videos on the web explaining this rule, with discussion and computations
at http://billiards.colostate.edu.
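
If the graph is not at hand, the rule is easy to check for oneself. The following Python sketch (the function name is mine) simply evaluates the last formula for θ, for a cue ball that is rolling at impact, at a few cut angles φ.

```python
import math


def deflection_angle_deg(cut_angle_deg):
    """theta = arctan(sin(phi)*cos(phi) / (sin(phi)**2 + 2/5)), in degrees."""
    phi = math.radians(cut_angle_deg)
    return math.degrees(
        math.atan2(math.sin(phi) * math.cos(phi), math.sin(phi) ** 2 + 0.4)
    )


for cut in (10, 20, 30, 40, 50, 60, 70, 80):
    print(cut, round(deflection_angle_deg(cut), 1))
```

In this formula θ stays within a few degrees of 30° for cut angles from roughly 15° to 45°, and falls off toward 0° as φ approaches 0° or 90°.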

Apparently, billiard and pool players actually make use of rules like these, 
though I don’t know if such crutches are ever needed by those who can execute 
shots like that shown on page 419. This shot is analyzed in Coriolis' book,
whose preface contains the acknowledgment: “Monsieur de Tholozé, governor
of the École Polytechnique, graciously showed me several complicated shots,
which the theory afterwards explained: it is he whom I saw make the shot 
diagrammed ... which [is] explained in the course of the work.” 


Have fun, and good luck, reading the book! 


The Jellett invariant. In Chapter 9 we analyzed the “heavy symmetrical top”, 
corresponding to a typical toy top with a pointed end, which we assumed was 
fixed. By contrast, the last chapter of Jellett [1], one of the pioneering works in 
the theory of friction, analyzes a top with a non-pointed end. Jellett basically 
restricted his considerations to the case where the bottom was shaped like a part 
of a ball, and it is somewhat more convenient to consider a body that simply 
is in the shape of a ball, as in (a), with center O and radius R, with a density
that is not necessarily uniform, though it is symmetric about an axis, so that the
center of mass G lies on this axis, at a distance a from O. The ball is thus a
symmetric top with principal moments of inertia I₁, I₁, I₃.

[Figure (a)]

In (b), showing the body after it has moved along the supporting surface, now
at contact with the surface at C, we have introduced X, Y, and Z-axes along
the principal directions (with the Y-axis pointing into the plane of the paper in
this figure), as well as the Euler angle θ and the unit vectors e_Z and e_z along
the Z-axis and the unrotating z-axis. Here we are allowing our (x, y, z)-plane
to move so that G is always at the origin, as mentioned on page 344.

It will also be convenient to allow the X and Y axes to rotate within the body,
as anticipated in Problem 9-8, and we choose X to lie in the plane through the
Z-axis that contains the contact point C, as in the figure, so φ measures the
angle by which this plane rotates. The line of nodes, the intersection of the
(x, y)-plane and the (X, Y)-plane, has the Y-axis lying in it (c), perpendicular
to the (X, Z)-plane, so the angle ψ needed to bring the line of nodes to X is
just ψ = −π/2. Thus our equations (ω) on page 344 simplify to

$$(\omega_s)\qquad \omega_1 = \phi'\sin\theta, \qquad \omega_2 = -\theta', \qquad \omega_3 = \phi'\cos\theta,$$


which one could also easily see directly. 
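The figure also shows the vector r from the contact point C to G. Note that
since the distance from G to O is a, we have r + a e_Z = R e_z or

$$\text{(i)}\qquad \mathbf{r} = R\,\mathbf{e}_z - a\,\mathbf{e}_Z,$$

while the angular momentum L about G can be written as

$$\text{(ii)}\qquad \mathbf{L} = I_1\omega_1\mathbf{e}_X + I_1\omega_2\mathbf{e}_Y + I_3\omega_3\mathbf{e}_Z = I_1\boldsymbol{\omega} + (I_3 - I_1)\,\omega_3\,\mathbf{e}_Z.$$

If F is the total force applied to the ball at C, including the upward force of
the surface on which the sphere is moving that balances out the downward
gravitational force on the ball, as well as the force of friction, about which we
will make no assumptions at all, then the torque τ about the center of mass is
simply τ = −r × F, so we have

$$\text{(iii)}\qquad \mathbf{L}' = \boldsymbol{\tau} = -\mathbf{r}\times\mathbf{F}.$$

Now (i) gives r' = −a · ω × e_Z, so (ii) gives

$$\langle\mathbf{L}, \mathbf{r}'\rangle = \langle I_1\boldsymbol{\omega} + (I_3 - I_1)\,\omega_3\,\mathbf{e}_Z,\; -a\cdot\boldsymbol{\omega}\times\mathbf{e}_Z\rangle = 0,$$

since ω × e_Z is perpendicular to both ω and e_Z. On the other hand, from (iii)
we get

$$\langle\mathbf{L}', \mathbf{r}\rangle = 0,$$

and from these last two equations we conclude that ⟨L, r⟩' = 0, and thus that

$$\text{(1)}\qquad \langle\mathbf{L}, \mathbf{r}\rangle = \text{constant}.$$

Resubstituting r = R e_z − a e_Z into (1), and dividing by R, we get

$$\text{(2)}\qquad \langle\mathbf{L}, \mathbf{e}_z\rangle - \varepsilon\,\langle\mathbf{L}, \mathbf{e}_Z\rangle = J, \qquad\text{where } \varepsilon = a/R$$

for a constant J (Jellett's constant).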




In either form, this may seem like a pretty elementary conclusion, but it is 
noteworthy because it holds no matter what is assumed about the frictional 
forces or motion of the top, so long as it remains in contact with the plane. 
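
The two orthogonality relations behind Jellett's constant can be spot-checked with random data. In the Python sketch below, whose helper names are my own, r, L, L' and r' are built from (i), (ii), (iii) for an arbitrary angular velocity ω, an arbitrary unit vector e_Z and an arbitrary force F, and we confirm that ⟨L, r'⟩ and ⟨L', r⟩ both vanish up to rounding. It checks only this pointwise identity, not a full simulation of the motion.

```python
import math
import random


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def jellett_terms(omega, e_Z, F, R, a, I1, I3):
    """Return (<L, r'>, <L', r>) for the given instantaneous data."""
    e_z = (0.0, 0.0, 1.0)                                   # the fixed vertical
    r = tuple(R * p - a * q for p, q in zip(e_z, e_Z))      # (i)
    w3 = dot(omega, e_Z)                                    # component along Z
    L = tuple(I1 * w + (I3 - I1) * w3 * q
              for w, q in zip(omega, e_Z))                  # (ii)
    L_dot = tuple(-c for c in cross(r, F))                  # (iii)
    r_dot = tuple(-a * c for c in cross(omega, e_Z))        # r' = -a (omega x e_Z)
    return dot(L, r_dot), dot(L_dot, r)


random.seed(0)
for _ in range(5):
    omega = tuple(random.uniform(-1, 1) for _ in range(3))
    F = tuple(random.uniform(-1, 1) for _ in range(3))
    z = tuple(random.uniform(-1, 1) for _ in range(3))
    norm = math.sqrt(dot(z, z))
    e_Z = tuple(c / norm for c in z)
    print(jellett_terms(omega, e_Z, F, R=1.0, a=0.2, I1=1.1, I3=0.9))
```

Since F is drawn at random here, the check reflects the point made in the text: nothing at all is assumed about the force at the contact point.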

We can also make the result look a lot less elementary by writing it in terms 
of the Euler angles! Substituting 


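$$\langle\mathbf{L}, \mathbf{e}_z\rangle = L_1\sin\theta + L_3\cos\theta = I_1\omega_1\sin\theta + I_3\omega_3\cos\theta$$

into (2), we obtain

$$I_1\sin\theta\cdot\omega_1 + I_3(\cos\theta - \varepsilon)\,\omega_3 = J$$

and then by (ω_s)

$$\text{(J)}\qquad I_1\sin^2\theta\cdot\phi' + I_3(\cos\theta - \varepsilon)\,\omega_3 = J,$$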


which is basically the form in Jellett [1; pg. 185], who used the same special sort 
of coordinates as used here. 

This result is called Jellett's integral, where “integral” is here being used in the
same sense that conservation of momentum is called an integral of the equations
of motion. This 19th century discovery was long ignored and almost completely
forgotten, until rescued from oblivion by a 20th century toy.


Tippe Tops and hard boiled eggs. The Tippe Top was invented, or perhaps
merely reinvented, in 1950 by a Danish engineer Werner Østberg, who reported
that on a visit to South America he saw people playing with a small fruit, which
when spun by its stem would turn over and end up spinning on the stem. It
appears that a typical college ring exhibits similar behavior: when spun with
the heavy stone at the bottom, it turns over and spins with the stone on top. If
you don't have a college ring, the toys can still be bought, quite inexpensively,
and are usually in the shape of a sphere with a small section capped off, with a
stem attached. In (a), at the beginning of the spin, the center of mass G of the
Tippe Top lies slightly below the center O of the original sphere, while in (b),
at the end of the spin, it ends up above it.

[Figure: panels (a) and (b), the Tippe Top at the beginning and at the end of the spin.]




The Tippe Top seems to have fascinated physicists as well as kids. An
oft-mentioned photograph from 1951 shows Wolfgang Pauli and Niels Bohr bent
down peering at a Tippe Top in action,¹ and in 1952 and 1953 the basic principles
were explained in several papers, followed by numerous later papers, some
quite recent.

The first thing one immediately notices about the Tippe Top is that its potential
energy increases when it turns over, since the center of mass ends up higher.
A second fact is not so immediately apparent. If one takes a toy top of the sort
mentioned on page 349, spinning around a protruding axis that can be held in
the hand, and then turns the top over (a), it is naturally now spinning in the
other direction. The angular momentum has reversed sign, thereby changing
by a fairly large amount; this requires a torque, which is why it's harder to invert
the top when it is spinning. On the other hand, as observation confirms, when
the Tippe Top turns itself over (b), it ends up spinning in the same direction as
when it started, so the spin reverses direction within the body coordinates.

[Figure: panels (a) and (b).]

Nevertheless, since the angular momentum determines the kinetic energy, 
and since the potential energy increases slightly when the top turns over, the 
angular momentum must decrease slightly when it turns over. So a small torque 
is still required to produce this change of angular momentum. And just where
does this torque come from? The answer is, it comes from friction. Here is a
rough, approximate, description of the process. 

To begin with, when the top is at an angle, and rotating about the vertical
axis through its center of mass G, the contact point C, lying below O, is moving 




¹ It can be found by googling “pauli picture”.




in a small curve, whose velocity vector is a horizontal vector v nearly
perpendicular to the plane of the diagram. Thus the contact point is sliding, giving rise
to a frictional force f in the opposite direction (pointing into the plane of the
diagram), and the torque τ of f around G is nearly horizontal, since O and G
are very close. This torque rotates around with the same angular velocity ω as
the top, and averages out to 0, so L' ≈ 0 on the average, and angular momentum
is nearly conserved, as we basically already noted when we mentioned that
the top ends up spinning in the same direction. Thus we have approximately

$$\mathbf{L} = \omega\cdot\mathbf{e}_z.$$

The principal moments of inertia I₁ and I₃ for a Tippe Top are usually close
in value, so for a very rough seat-of-the-pants approximation¹ we will assume
that they are actually equal. As shown in Problem 9-7, this implies that we then
also have τ = L' in body coordinates. For the unit vector e_Z along the Z-axis,
which is constant in body coordinates, we thus have

$$\langle\mathbf{e}_Z, \boldsymbol{\tau}\rangle = \langle\mathbf{e}_Z, \mathbf{L}'\rangle = \langle\mathbf{e}_Z, \mathbf{L}\rangle'.$$

But

$$\langle\mathbf{e}_Z, \boldsymbol{\tau}\rangle = -|\boldsymbol{\tau}|\sin\theta,$$
$$\langle\mathbf{e}_Z, \mathbf{L}\rangle' = (|\mathbf{L}|\cos\theta)' = -|\mathbf{L}|\sin\theta\cdot\theta',$$

giving

$$\theta' = |\boldsymbol{\tau}|/|\mathbf{L}| = \mu W R/I\omega,$$

where μ is the coefficient of friction, W the weight of the top, and R the radius
of the original sphere. Thus θ keeps increasing, until the stem touches the plane
(at some time < π/θ'). From then on a similar argument about friction on the
stem shows that the top will start to rise.
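
To get a feeling for the sizes involved, one can plug numbers into this estimate. The Python sketch below uses function names of my own, and the sample values for a small toy (mass 15 g, radius 1.5 cm, moment of inertia taken as (2/5)mR², 150 radians per second of spin, μ = 0.3) are purely hypothetical choices, not measurements.

```python
import math


def tilt_rate(mu, W, R, I, omega):
    """Rough estimate theta' = mu*W*R / (I*omega)."""
    return mu * W * R / (I * omega)


def stem_contact_time_bound(mu, W, R, I, omega):
    """Upper bound pi/theta' for the time until the stem touches the plane."""
    return math.pi / tilt_rate(mu, W, R, I, omega)


# Hypothetical sample values, in SI units.
m, R, g = 0.015, 0.015, 9.8
I = 0.4 * m * R**2            # assumed: moment of inertia of a uniform ball
print(tilt_rate(0.3, m * g, R, I, 150.0))                # radians per second
print(stem_contact_time_bound(0.3, m * g, R, I, 150.0))  # seconds
```

With these made-up values the bound comes out at about a second, the same order of magnitude as the few seconds that the turnover is said to take.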


The early Tippe Top papers, which are referenced in some later papers that 
we will mention, gave considerably more detailed analyses, although they all 
used approximations of some sort, depending on the particular aspect of the 
top’s behavior that they were focused on. One detail that we want to investigate 
is the process by which the top ends up revolving in the same direction.²


¹ Barger and Olsson [1].
² Pliskin [1].




Until the stem touches the plane, the center of mass G of the top is practically
stationary, and we will consider it as the fixed point for the Euler equations,
remembering, however, that we chose X and Y in a special way, so that they
are not fixed in the body. If α is the angular velocity of the contact point C
around the Z-axis, then the second Euler equation in Problem 9-8 becomes

$$\tau_2 = I_1\omega_2' + (I_1 - I_3)\,\omega_1\omega_3 - I_3\,\alpha\,\omega_1.$$

Pretending, as before, that the force of friction is exactly perpendicular to the
plane of the diagram on page 428, the only torque about G along the Y-axis
will be due to the upward force of magnitude W acting at C, at a perpendicular
distance a sin θ from G, and we find that τ₂ = aW sin θ. So we have

$$aW\sin\theta = I_1\theta'' + (I_1 - I_3)\,\omega_1\omega_3 - I_3\,\alpha\,\omega_1.$$

Once again we approximate L = ω · e_z, which implies that

$$\omega_1 = \omega\sin\theta, \qquad \omega_3 = \omega\cos\theta,$$

so that our equation becomes

$$aW\sin\theta = I_1\theta'' + (I_1 - I_3)\,\omega^2\sin\theta\cos\theta - I_3\,\alpha\,\omega\sin\theta.$$

Now the Tippe Top's behavior indicates that θ' is small compared to ω (the top
is set spinning at many revolutions per second, while it takes several seconds for
the top to turn over). It seems reasonable to conclude that θ'' is also small (at
least on average). If we therefore simply eliminate the I₁θ'' term, we end up
with

$$\alpha = -\frac{aW}{I_3\,\omega} + \frac{I_1 - I_3}{I_3}\,\omega\cos\theta.$$

Thus α = 0 for θ₀ satisfying

$$\cos\theta_0 = \frac{aW}{(I_1 - I_3)\,\omega^2},$$

and α changes sign at θ₀. When ω is very large, θ₀ is close to π/2, with the
stem nearly horizontal at the time that the spin reverses. If a Tippe Top is set
spinning on carbon paper, it gives a carbon trace of the path of C along the
Tippe Top that illustrates these facts.




These rough approximations illustrate the general tenor of the discussion
of the early Tippe Top papers. Then in 1974 Cohen [1], inspired after idly
spinning his college ring, simply wrote down the complete set of equations and
found computer-generated solutions for “these horribly nonlinear differential
equations” without approximations, though he did work with a sphere, as in
the analysis of Jellett's integral, the results then being valid for a Tippe Top
up to the point where the stem touches the surface. They attest both to the
general correctness of the simplifying assumptions, and the extent to which they
oversimplify things. For example, the graph of θ over time looked something
like the figure below.

[Figure: graph of θ (radians) against t (seconds), for t from 0 to 2.0.]


The next stage of the Tippe Top investigations involved the rediscovery of
Jellett's integral. The classical form of Jellett's integral was first rediscovered
in an early Tippe Top paper by Hugenholtz [1], where it was used to discuss
certain aspects of the Tippe Top's motion.

But Jellett's integral was given a decisive role in a paper by Leutwyler [1] in
1993, which was “written to provide entertainment”, with the author being
unencumbered by familiarity with the historical literature. Leutwyler once again
rediscovered Jellett's integral, now in the form (2) on page 425, remarking that
this integral “seems to have escaped attention”, but serendipitously denoting
the constant by J.

Leutwyler actually discovered only a special case of Jellett's integral, and his
demonstration was rather strange, but the importance of his paper was the way
in which he used Jellett's constant. He determined when the total energy E
has the lowest possible value for a fixed value of J, the idea being that sliding
friction will cause the top to lose energy, so it should end up in a position
with E a minimum.




Before presenting Leutwyler’s argument, we might point out that Jellett him- 
self actually proved his integral only as an approximation, making simplifying 
assumptions about friction. The general result was then proved in the classic
(distressingly difficult to read) book Routh [1; Art. 241c], where it was used to 
analyze the completely solvable case of a top with a spherical bottom rolling 
without slipping on a plane, an investigation that was completely ignored, as 
pointed out in 1999 by Gray and Nickel [1], which is basically a rewriting of
Routh’s results in modern terminology, together with an extensive bibliography 
of older papers. Our proof of Jellett’s integral comes from this paper, which 
also reintroduces the much more complicated, much less well known, Routh 
integral. 


To derive Leutwyler’s result, leaving some of the details to his paper, we want 
to work with an ordinary set of principal directions fixed in the top, just like 
those used in Chapter 9. We find that now 


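$$J = \phi'\bigl(I_1\sin^2\theta + I_3\cos\theta(\cos\theta - \varepsilon)\bigr) + \psi' I_3(\cos\theta - \varepsilon).$$

Similarly, using the equation for the rotational energy T on page 346, we write
the total energy, for a top of mass M, as

$$E = \frac{M}{2}\,(x'^2 + y'^2 + z'^2) + \frac{I_1}{2}\,(\theta'^2 + \phi'^2\sin^2\theta) + \frac{I_3}{2}\,(\phi'\cos\theta + \psi')^2 + Mgz,$$

where we have z = R(1 − ε cos θ).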

Since J doesn't involve x', y', or θ', the minimum of E for a fixed J obviously
occurs when 0 = x' = y' = θ', so x, y, and θ are constant, which also implies
that z = R(1 − ε cos θ) is constant; thus the center of mass and the angle of
rotation are fixed at a minimum for E.

Now for each fixed θ we look for the minimum of E for a fixed value of J.
Computations show (cf. Problem 4) that this occurs when ψ' + εφ' = 0, with
the energy being

$$E(\theta) = \frac{J^2}{2\,\mathcal{I}(\theta)} + MgR\,(1 - \varepsilon\cos\theta),$$

where

$$\mathcal{I}(\theta) = I_3(\cos\theta - \varepsilon)^2 + I_1\sin^2\theta.$$

When the top spins rapidly, so that ¢’ and hence J 1s large, the minima of the 
E(@) are the maxima of the &(@). We have &(0) < &(z), so that, as expected, 




the top has less energy spinning upside down than in its initial condition. In
fact, it turns out that for

$$(1 - \varepsilon)I_3 < I_1 < (1 + \varepsilon)I_3$$

the function $\mathcal{I}(\theta)$ is increasing, so that the top moves into the inverted position.
If instead

$$(1 + \varepsilon)I_3 < I_1,$$

then $E(\theta)$ has a minimum close to

$$\cos\theta = -\frac{\varepsilon I_3}{I_1 - I_3},$$

and the top ends up rolling around a vertical axis through the center of mass
with the contact point moving in a circle of radius $a\sin\theta$. Finally, if

$$I_1 < (1 - \varepsilon)I_3,$$

then both $\theta = \pi$ and $\theta = 0$ are local minima, so the initial position is stable.
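
The three cases can be explored numerically. The following is only a rough sketch, not part of Leutwyler's argument: it assumes the fast-spin limit, in which the minima of E are the maxima of the function $I_3(\cos\theta - \varepsilon)^2 + I_1\sin^2\theta$, and the names `script_I`, `regime`, and `argmax_theta` are invented for the illustration.

```python
import math

def script_I(theta, I1, I3, eps):
    """The function I3*(cos(theta) - eps)**2 + I1*sin(theta)**2 from the text."""
    return I3 * (math.cos(theta) - eps) ** 2 + I1 * math.sin(theta) ** 2

def regime(I1, I3, eps):
    """Classify the fast-spin behaviour by comparing I1 with (1 -/+ eps) * I3."""
    if I1 > (1 + eps) * I3:
        return "tilted"    # interior maximum of script_I: top rolls at an angle
    if I1 > (1 - eps) * I3:
        return "inverts"   # script_I increasing on [0, pi]: top turns over
    return "stays"         # theta = 0 and theta = pi both local minima of E

def argmax_theta(I1, I3, eps, steps=20000):
    """Grid search for the theta in [0, pi] that maximises script_I."""
    grid = [math.pi * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda th: script_I(th, I1, I3, eps))

# Sample (made-up) parameters, one for each of the three cases.
for I1 in (1.0, 2.0, 0.5):
    print(I1, regime(I1, 1.0, 0.1), argmax_theta(I1, 1.0, 0.1))
```

In the "tilted" case the grid maximum should land near $\cos\theta = -\varepsilon I_3/(I_1 - I_3)$; the potential term, ignored here, shifts it slightly.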

Leutwyler's rediscovery of Jellett's integral was followed by several other papers
with particular emphasis on questions of stability, using the integral to reduce
the number of equations required for a complete set of equations for the Tippe
Top. The latest I know of is Rauch-Wojciechowski, Sköldstam, and Glad [1],
which bills itself as “a rigorous, and possibly complete analysis of ... the tippe 
top ... ”, and has references to the important earlier investigations. 


We'll end this long, though incomplete, discussion with a more mundane, and 
some might even say, frivolous, phenomenon. If the college ring is the well-to-do 
person's Tippe Top, then the poor person's Tippe Top might very well be the
hard boiled egg. When a hard boiled egg is placed on a surface and set
spinning it will (as experience has shown me) quickly stand up on its long axis and


continue spinning—quite a bit more rapidly, because the moment of inertia 
around the long axis is less than that around the short axis. A spinning football 
is supposed to behave similarly. 




One of the first explanations of the spinning egg behavior was given in Moffatt 
and Shimomura [1], and a very thorough investigation of the phenomenon was 
subsequently made by Bou-Rabee, Marsden, and Romero [1]. Both papers rely 
on the fact that although the Jellett invariant is defined only for a top with a
spherical bottom, it is an “adiabatic invariant” for more general shapes. We 
will talk about adiabatic invariants in Chapter 22, but we won’t be returning to 
the consideration of the spinning egg; I fear that if I tried to explain any details 
of these papers, I'd probably just end up with egg all over my face.


And yes, there does seem to be something about eggs that encourages frivolity. 
The paper by Bou-Rabee, Marsden, and Romero is entitled “A geometric
treatment of Jellett's egg”, while the article by Moffatt and Shimomura, which
appeared in the March 28, 2002 issue of Nature, is announced with the banner: 
An explanation for an odd egg performance is rolled out in time for Easter. 




PROBLEMS 


1. Consider a plane that can be slanted at various angles $\alpha$ to the floor, and
an object placed on it. Show that the condition that the object doesn't move is
$\tan\alpha < \mu_s$. The “angle of friction” $\theta$ at which the object just starts to move
thus gives $\mu_s = \tan\theta$ (while $\mu$ itself can be calculated by observing the speed
of descent for $\alpha > \theta$).


2. Suppose a smooth cane or walking stick, or even a wooden ruler, of mass m, 
is balanced on two fingers, at different distances a and b from the midpoint. 


[Figure: a stick resting on two fingers A and B, at distances a and b from the midpoint.]


We let $\mu$ be the coefficient of kinetic friction for wood on fingers, and $\mu_s$ the
coefficient of static friction. Although we only stated that $\mu \le \mu_s$, in most cases
we have $\mu < \mu_s$; this is certainly the case for wood on fingers.


(a) The forces on A and B are

$$F_A = \frac{b}{a+b}\,mg, \qquad F_B = \frac{a}{a+b}\,mg.$$

Assuming b > a, as in the figure, so that $F_A > F_B$, conclude that as the fingers
are moved toward each other, A stays at the same position on the ruler while B
moves closer until it reaches a distance $b_1 < a$ where the sliding friction of B
equals the static friction of A, and thus

$$\mu a = \mu_s b_1, \qquad \frac{a}{b_1} = \frac{\mu_s}{\mu}.$$

At this point, A starts moving until it reaches $a_1$ with

$$\frac{b_1}{a_1} = \frac{\mu_s}{\mu},$$

so A and B approach each other in geometric progression, ending (theoretically 
after an infinite number of steps, though in practice after only a few) at the 
midpoint. 




3. Consider the situation in Problem 6-18, except that there is a non-zero
coefficient $\mu$ of friction between the filament and the fixed object, and suppose that
the filament starts slipping in the direction from c(s) to c(s + h). Then there is
an additional force at c(s) in the opposite direction with a magnitude close to
$\mu|N|h$. Conclude that we now have

$$T'(s) = \mu|N|, \qquad |N| = \kappa T,$$

and thus, for a total turning angle $\theta$,

$$T(\theta) = T_0 e^{\mu\theta}.$$

This exponential increase explains the effectiveness of a capstan, where a rope
wrapped around a post several times allows a very small force to keep a very
large one at bay.

[Figure: a rope wrapped around a post, with tensions $T_0$ and T at its two ends.]

For example, if the coefficient of friction between the rope and
the post is 1/2, and the rope is wrapped twice around the post, then

$$T = T_0 e^{\frac12\cdot 4\pi} = T_0 e^{2\pi}$$

for a rope beginning to slip in the direction of T. Since

$$\frac{T_0}{T} = e^{-2\pi} \approx .0019,$$

a load of 2000 pounds can be kept from slipping by a force of about 3.8 pounds.
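
The arithmetic of the capstan example is easy to reproduce. This small sketch assumes only the formula $T = T_0 e^{\mu\theta}$ with $\theta = 2\pi$ times the number of turns; the function name `holding_ratio` is made up for the illustration.

```python
import math

def holding_ratio(mu, turns):
    """Return T0/T = exp(-mu * theta) for a rope wrapped `turns` times."""
    theta = 2 * math.pi * turns        # total angle of wrap in radians
    return math.exp(-mu * theta)

ratio = holding_ratio(0.5, 2)          # friction coefficient 1/2, two turns
print(ratio)                           # the ratio T0/T, close to .0019
print(2000 * ratio)                    # holding force in pounds for a 2000 pound load
```

The unrounded ratio gives a holding force slightly under the 3.8 pounds obtained from the rounded value .0019.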


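4. Find the minimum of

$$E(\Phi, \Psi) = \frac{I_1}{2}\sin^2\theta\cdot\Phi^2 + \frac{I_3}{2}(\Phi\cos\theta + \Psi)^2 + Mgz \qquad (\Phi = \phi',\ \Psi = \psi')$$

for a fixed value of

$$J(\Phi, \Psi) = \bigl(I_1\sin^2\theta + I_3\cos\theta(\cos\theta - \varepsilon)\bigr)\cdot\Phi + I_3(\cos\theta - \varepsilon)\Psi$$

by expressing E in terms of $\Phi$ (and the constant J) alone. Or, since we've
already given the answer, verify it by using the fact (Problem 5-2) that at the
desired $(\Phi, \Psi)$ there will be a $\lambda$ for which the partial derivatives satisfy

$$E_\Phi = \lambda J_\Phi, \qquad E_\Psi = \lambda J_\Psi,$$

noting that for $\Psi = -\varepsilon\Phi$, both hold, with $\lambda = \Phi$.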


PART III


LAGRANGIAN 
MECHANICS 


CHAPTER 12 
ANALYTICAL MECHANICS 


We now come to a piece of work which united and crowned 
all the efforts which were made in the XVIIIth Century to 
develop a rationally organised mechanics. Mécanique analytique 
appeared in 1788. ... [Lagrange] became preoccupied with the
organisation of mechanics ... the perfection of its mathematical
language and the isolation of a general analytical method for
solving its problems.
— Dugas, A History of Mechanics


Parts I and II have covered material that would generally be regarded as
“elementary” mechanics. Although we have allowed ourselves the luxury 
of using whatever mathematical techniques we needed, we have basically always 
reduced everything to derivations directly from Newton’s laws. 

Lagrangian mechanics, the first encounter with “advanced” mechanics, is also 
called analytical mechanics, as it sedulously shuns geometric arguments, and no- 
tably deals with scalar functions in preference to vectors, which so often require 
diagrams to understand and facilitate computations. The preface to Lagrange's
famous Mécanique Analytique, Lagrange [1], notoriously declared “No figures
will be found in this work.”

Mécanique Analytique is now available in an English translation (the fourth
edition of 1811), but despite the encomium offered by Dugas to the “perfection of its
mathematical language”, it is a mathematical language now quite foreign to us.
So we will take advantage of modern notation on differentiable manifolds, ex- 
pressly designed to elucidate the implicit conventions of this classical language, 
which mathematicians of the time grasped intuitively (or just mimicked). 


The mathematical arena for analytical mechanics. We are going to be working
with a differentiable manifold M together with the tangent bundle TM,
with $M_{\mathfrak p}$ the tangent space at $\mathfrak p \in M$ (we use something like $\mathfrak p$ for points
in M, reserving p for later use, cf. page 445, and especially in Part IV). We
will use the notation and conventions of DG, especially Volume 1. In keeping
with the conventions of mechanics books, we will use q for a typical coordinate
system on M, though writing it, as usual in differential geometry, with
superscripts $(q^1, \dots, q^n)$. Recall (DG, Chap. 3) that if $\pi: TM \to M$ is the
projection, every tangent vector $v \in M_{\mathfrak p}$ at a point $\mathfrak p \in M$ can be written
uniquely as

$$v = \sum_{i=1}^n a^i \left.\frac{\partial}{\partial q^i}\right|_{\mathfrak p},$$

and if we define $\dot q^i(v) = a^i$, then on TM we
have the coordinate system $(q^1\circ\pi, \dots, q^n\circ\pi, \dot q^1, \dots, \dot q^n)$, often written simply
as $(q^1, \dots, q^n, \dot q^1, \dots, \dot q^n)$ for convenience. And in general, for a function f
on M we often use f as a shorthand for the map $f\circ\pi$ on TM.

Also recall that for a smooth curve $c: \mathbb R \to M$, we have the tangent vector
$c'(t) \in M_{c(t)}$ for each t, so that c' is a curve $c': \mathbb R \to TM$.

Since $(q^1, \dots, q^n, \dot q^1, \dots, \dot q^n)$ is a coordinate system on TM, for any
differentiable function L on TM the expressions $\partial L/\partial q^i$ and $\partial L/\partial\dot q^i$ make sense; formally,
we just write down a formula for L in terms of the $q^i$ and $\dot q^i$, and then simply
regard them as new “variables”.

Specialized considerations for analytical mechanics. Suppose M is a submanifold
of an N-dimensional manifold with the coordinate system $(x^1, \dots, x^N)$.
The N-dimensional manifold might be M itself, so that we are comparing two
coordinate systems on M, but the case of most interest will actually be $M \subset \mathbb R^N$.
For points $\mathfrak p$ in the domain of a coordinate system q on M, we have

$$x^\alpha(\mathfrak p) = X^\alpha(q^1(\mathfrak p), \dots, q^n(\mathfrak p)), \qquad \alpha = 1, \dots, N$$

for certain smooth functions $X^\alpha: \mathbb R^n \to \mathbb R$, or simply

$$x^\alpha = X^\alpha\circ(q^1, \dots, q^n).$$

On TM we also have the functions $\dot x^\alpha$ and $\dot q^1, \dots, \dot q^n$, related by a similar
equation, involving other functions, say

$$\dot x^\alpha = \dot X^\alpha\circ(q^1, \dots, q^n, \dot q^1, \dots, \dot q^n).$$


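To figure out exactly what $\dot X^\alpha$ is, we consider a curve c in M with $c(0) = \mathfrak p$
and let $v = c'(0) \in M_{\mathfrak p}$. Differentiating the equation

$$x^\alpha(c(t)) = X^\alpha(q^1(c(t)), \dots, q^n(c(t)))$$

we obtain, using $D_i$ as the partial with respect to the i-th argument,

$$(x^\alpha\circ c)'(t) = \sum_{i=1}^n D_iX^\alpha(q^1(c(t)), \dots, q^n(c(t)))\cdot(q^i\circ c)'(t),$$

and evaluating at t = 0 gives

$$\dot x^\alpha(v) = \sum_{i=1}^n D_iX^\alpha(q^1(\mathfrak p), \dots, q^n(\mathfrak p))\cdot\dot q^i(v),$$

which means that, shorn of extraneous symbols, we have

$$\dot X^\alpha(a^1, \dots, a^n, b^1, \dots, b^n) = \sum_{i=1}^n D_iX^\alpha(a^1, \dots, a^n)\cdot b^i.$$

In particular, we have

$$D_{n+i}\dot X^\alpha(a^1, \dots, a^n, b^1, \dots, b^n) = D_iX^\alpha(a^1, \dots, a^n).$$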
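
To see the last relation in a concrete case, one can check it by finite differences. The sketch below is only an illustration: it takes M to be the plane with polar coordinates, so that X is the hypothetical map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$, and the helper names `partial` and `X_dot` are invented here.

```python
import math

def X(q):
    """Example X: R^2 -> R^2, polar (r, theta) to rectangular (x, y)."""
    r, th = q
    return (r * math.cos(th), r * math.sin(th))

def partial(f, args, k, h=1e-6):
    """Central-difference partial derivative of a tuple-valued f in argument k."""
    up, dn = list(args), list(args)
    up[k] += h
    dn[k] -= h
    return tuple((u - d) / (2 * h) for u, d in zip(f(up), f(dn)))

def X_dot(ab):
    """X_dot(a, b) = sum_i D_i X(a) * b^i for ab = (a^1, a^2, b^1, b^2)."""
    a, b = ab[:2], ab[2:]
    cols = [partial(X, a, i) for i in range(2)]   # cols[i][al] approximates D_i X^al(a)
    return tuple(sum(cols[i][al] * b[i] for i in range(2)) for al in range(2))

point = [2.0, 0.7, 0.3, -1.1]                     # arbitrary (a^1, a^2, b^1, b^2)
for i in range(2):
    lhs = partial(X_dot, point, 2 + i)            # D_{n+i} of X_dot at (a, b)
    rhs = partial(X, point[:2], i)                # D_i of X at a
    print(i, lhs, rhs)
```

The two printed tuples should agree up to rounding for each i, which is all the relation says: the dependence of $\dot X^\alpha$ on the b's is linear, with coefficients $D_iX^\alpha$.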


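As indicated previously, we will allow $q^i$ to stand for $q^i\circ\pi$ on TM, and
similarly for $x^\alpha$. Remembering just how the symbols $\partial/\partial q^i$ and $\partial/\partial\dot q^i$ are defined,
we see from the final equation on the previous page that on TM we now have
the meaningful, and true, equations¹

$$\text{(a)}\qquad \frac{\partial\dot x^\alpha}{\partial\dot q^i} = \frac{\partial x^\alpha}{\partial q^i},$$

and we also easily derive

$$\text{(b)}\qquad \frac{\partial\dot x^\alpha}{\partial q^i} = \sum_{j=1}^n\frac{\partial^2x^\alpha}{\partial q^i\,\partial q^j}\,\dot q^j.$$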


Lagrange’s equations. Now let us recall, from page 210, d’Alembert’s Principle 
for Constraints: 


If the constraints on a system confine the system to a configuration 
space M, and are perpendicular to M, then for all t the motions of the 
system under the external forces F satisfy 


$$(*)\qquad \langle F(c(t)) - mc''(t),\ v\rangle = 0 \quad\text{for all } v \in M_{c(t)},$$


where $\langle\ ,\ \rangle$ denotes the usual inner product on $\mathbb R^N$.


In Chapter 6, we struggled to use this principle by expressing all vectors 
in terms of a set of coordinates for M. Now we will take a totally different 
approach, writing (*) directly in terms of real-valued functions in an arbitrary 
coordinate system gq on M. 

Let $x^1, \dots, x^N$ be the standard coordinates in $\mathbb R^N$, where N = 3K for a system
of K particles, and the $x^\alpha$ are naturally grouped in triplets, though it is often
easiest to think of a curve c in $\mathbb R^N$ rather than a collection of K curves; for $mc''$,
which denotes $(m_1c_1'', \dots, m_Nc_N'')$, the $m_\alpha$ are naturally equal in triplets.

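Then d'Alembert's principle can be written as

$$\begin{aligned}
0 &= \sum_{\alpha=1}^N\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\cdot dx^\alpha(c(t)) && \text{on } M_{c(t)}\\
&= \sum_{\alpha=1}^N\sum_{i=1}^n\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\cdot\frac{\partial x^\alpha}{\partial q^i}(c(t))\,dq^i(c(t)) && \text{on } M_{c(t)},
\end{aligned}$$

and since the $dq^i$ are linearly independent on each $M_{c(t)}$, this is equivalent to
the set of n individual equations

$$\sum_{\alpha=1}^N\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\frac{\partial x^\alpha}{\partial q^i}(c(t)) = 0, \qquad i = 1, \dots, n.$$

¹ Classically obtained from the “equation” $x^\alpha = x^\alpha(q^1, \dots, q^n)$ by first computing that
$\dot x^\alpha = \sum_{i=1}^n(\partial x^\alpha/\partial q^i)\,\dot q^i$, and then, equally nonchalantly, that $\partial\dot x^\alpha/\partial\dot q^i = \partial x^\alpha/\partial q^i$.
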
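For the case of conservative forces, where $F_\alpha = -\partial V/\partial x^\alpha$, so that

$$\sum_{\alpha=1}^N F_\alpha\frac{\partial x^\alpha}{\partial q^i} = -\sum_{\alpha=1}^N\frac{\partial V}{\partial x^\alpha}\frac{\partial x^\alpha}{\partial q^i} = -\frac{\partial V}{\partial q^i},$$

we can thus write

$$(**)\qquad \sum_{\alpha=1}^N m_\alpha c_\alpha''(t)\,\frac{\partial x^\alpha}{\partial q^i}(c(t)) = -\frac{\partial V}{\partial q^i}(c(t)).$$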


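Now comes the fun part.

$$c_\alpha''(t)\,\frac{\partial x^\alpha}{\partial q^i}(c(t)) = \frac{d}{dt}\Bigl(c_\alpha'(t)\,\frac{\partial x^\alpha}{\partial q^i}(c(t))\Bigr) - c_\alpha'(t)\,\frac{d}{dt}\Bigl(\frac{\partial x^\alpha}{\partial q^i}(c(t))\Bigr)$$

which, using (a) on the first term,

$$\begin{aligned}
&= \frac{d}{dt}\Bigl(\dot x^\alpha(c'(t))\,\frac{\partial\dot x^\alpha}{\partial\dot q^i}(c'(t))\Bigr) - \dot x^\alpha(c'(t))\sum_{j=1}^n\frac{\partial^2x^\alpha}{\partial q^i\,\partial q^j}(c(t))\,\dot q^j(c'(t))\\
&= \frac{d}{dt}\Bigl(\dot x^\alpha(c'(t))\,\frac{\partial\dot x^\alpha}{\partial\dot q^i}(c'(t))\Bigr) - \dot x^\alpha(c'(t))\,\frac{\partial\dot x^\alpha}{\partial q^i}(c'(t)) \qquad\text{by (b)}\\
&= \frac{d}{dt}\frac{\partial}{\partial\dot q^i}\Bigl(\tfrac12[\dot x^\alpha]^2\Bigr)(c'(t)) - \frac{\partial}{\partial q^i}\Bigl(\tfrac12[\dot x^\alpha]^2\Bigr)(c'(t)).
\end{aligned}$$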


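When we multiply these equations by $m_\alpha$ and add them up for all $\alpha$, equation (**) becomes

$$(***)\qquad \frac{d}{dt}\Bigl(\frac{\partial T}{\partial\dot q^i}(c'(t))\Bigr) - \frac{\partial T}{\partial q^i}(c'(t)) = -\frac{\partial V}{\partial q^i}(c(t)),$$

where T is the kinetic energy,

$$T(c'(t)) = \frac12\sum_{\alpha=1}^N m_\alpha[\dot x^\alpha(c'(t))]^2, \quad\text{or simply}\quad T(v) = \frac12\sum_{\alpha=1}^N m_\alpha[\dot x^\alpha(v)]^2.$$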

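In the standard “condensed notation” of physics, where the curve c is
suppressed, we can write (***) as

$$\frac{d}{dt}\Bigl(\frac{\partial T}{\partial\dot q^i}\Bigr) - \frac{\partial T}{\partial q^i} + \frac{\partial V}{\partial q^i} = 0.$$

Finally, since V on TM is really $V\circ\pi$ for V on M, we have $\partial V/\partial\dot q^i = 0$, so we
can write

$$\frac{d}{dt}\Bigl(\frac{\partial(T - V)}{\partial\dot q^i}\Bigr) - \frac{\partial(T - V)}{\partial q^i} = 0.$$

Introducing the Lagrangian

$$L = T - V,$$

we thus have¹

$$\text{Lagrange's equations:}\qquad \frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}\Bigr) - \frac{\partial L}{\partial q^i} = 0.$$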

Using Lagrange's equations. To get an idea of what these convoluted calculations
come down to, we first consider a simple case that has nothing to do with
constraints, instead illustrating the usefulness of the fact that our equations can
be written in terms of any convenient coordinate system. We consider motion
in a plane under a central force, with the obvious choice of polar coordinates
$(r, \theta)$. As usual, we also allow r and $\theta$ to be used as abbreviations for $r\circ c$ and
$\theta\circ c$ [as a consequence, $\dot r$ also denotes $(r\circ c)'$, which under the same conventions
we would denote by r']. As noted on page 120, we have

$$T(r, \theta, \dot r, \dot\theta) = \tfrac12 m(\dot r^2 + r^2\dot\theta^2)$$

[in our current notation, we could deduce this by computing

$$\begin{aligned}
x &= r\cos\theta & \dot x &= \dot r\cos\theta - r\sin\theta\cdot\dot\theta\\
y &= r\sin\theta & \dot y &= \dot r\sin\theta + r\cos\theta\cdot\dot\theta,
\end{aligned}$$

and substituting into the formula $T = \tfrac12 m(\dot x^2 + \dot y^2)$ for standard coordinates],
while we wrote the potential V as

$$V(r, \theta) = F(r),$$

for F' = f, where −f(r) is the force at distance r from the origin. Thus the
Lagrangian L is

$$L = T - V = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) - F(r).$$


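
¹ For non-conservative forces, we have $\frac{d}{dt}\bigl(\frac{\partial T}{\partial\dot q^i}\bigr) - \frac{\partial T}{\partial q^i} = Q_i$ for the “generalized forces”
$Q_i = \sum_{\alpha=1}^N F_\alpha\cdot\frac{\partial x^\alpha}{\partial q^i}$.


Now for

$$L = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) - F(r),$$

Lagrange's equations

$$\frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot r}\Bigr) - \frac{\partial L}{\partial r} = 0, \qquad \frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot\theta}\Bigr) - \frac{\partial L}{\partial\theta} = 0$$

become

$$\frac{d}{dt}(m\dot r) - \bigl(mr\dot\theta^2 - f(r)\bigr) = 0, \qquad \frac{d}{dt}(mr^2\dot\theta) = 0,$$

the first of which is

$$m(\ddot r - r\dot\theta^2) + f(r) = 0.$$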
[The simplest approach to these calculations, and really the whole point of
the procedure, is just to forget about what anything means, and simply regard
$r, \dot r, \theta, \dot\theta$ as “independent variables” and work formally, until we have to take
the derivative d/dt, at which point it behooves us to remember that $\dot r$ means
$(r\circ c)'$, and $\ddot r$ means $(r\circ c)''$.]

We've left the second equation undifferentiated, because we immediately get

$$r^2\dot\theta = h \quad\text{for a constant } h,$$

the basic equation at the bottom of page 120, which we previously had obtained
as a separate step in the analysis. When we then substitute this into the first
equation we obtain

$$\ddot r = -\frac{f(r)}{m} + \frac{h^2}{r^3},$$

which is equation (B) on page 121; to obtain equation (A) on that page, from
which we originally derived (B), we would have to integrate once.
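
The reduced equations are easy to integrate numerically. The following sketch is only an illustration under stated assumptions: it uses a classical fourth-order Runge-Kutta step, an inverse-square attraction $f(r) = 1/r^2$ with $F(r) = -1/r$, and made-up initial data; the names `simulate` and `energy` are hypothetical. It integrates $\ddot r = h^2/r^3 - f(r)/m$ together with $\dot\theta = h/r^2$ and compares the energy $T + V$ at the start and at the end.

```python
def simulate(r0, rdot0, h, f, m=1.0, dt=1e-3, steps=5000):
    """Integrate r'' = h^2/r^3 - f(r)/m and theta' = h/r^2 with RK4 steps."""
    def deriv(state):
        r, rdot, theta = state
        return (rdot, h * h / r**3 - f(r) / m, h / r**2)

    state = (r0, rdot0, 0.0)
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def energy(r, rdot, h, F, m=1.0):
    """T + V with theta' eliminated through r^2 theta' = h."""
    return 0.5 * m * (rdot**2 + h * h / r**2) + F(r)

f = lambda r: 1.0 / r**2          # attraction of magnitude f(r)
F = lambda r: -1.0 / r            # potential, with F' = f
r_end, rdot_end, theta_end = simulate(1.0, 0.3, 1.0, f)
print(energy(1.0, 0.3, 1.0, F), energy(r_end, rdot_end, 1.0, F))
```

The two printed energies should agree to many digits, reflecting the integration from (B) back to (A) mentioned above.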

This simple example already illustrates the basic attractive feature of this
approach. We just write down the equations, manipulate them formally, and 
see what they end up saying, thereby obtaining a set of second order equations 
that will determine the functions (the kinetic energy T plays such a central role 
precisely because its first derivatives give us the second derivatives of the q’). 

One downside is the fact that the second order equations obtained may be 
more difficult to work with than those obtained in a more elementary way, as 
illustrated by the fact that we obtained (B) on page 121 instead of (A); and,
similarly, it would have been foolish to expand out $\frac{d}{dt}(r^2\dot\theta) = 0$.


In addition, the geometric significance of equations like $d/dt\,(r^2\dot\theta) = 0$ may
be somewhat obscured. But then again, we don't need the geometric arguments
to obtain them. Instead, we have an automatic way of obtaining such geometric
relations. We ended up with the equation $d/dt\,(r^2\dot\theta) = 0$ precisely because L
involves only $\dot\theta$, not $\theta$ itself. In general, if $q^i$ does not appear in L, but only $\dot q^i$,
the coordinate $q^i$ is called ignorable or cyclic [like $\theta$ in polar coordinates]. In this
case, the Lagrange equation will be

$$\frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}\Bigr) = 0 \quad\text{and hence}\quad \frac{\partial L}{\partial\dot q^i} = \text{constant}.$$

In particular, consider the heavy top of Chapter 9, with kinetic energy T given
by the second equation on page 346, except that now we also expand out $\omega_3$:

$$T = \frac{I_1}{2}\bigl(\dot\theta^2 + \dot\phi^2\sin^2\theta\bigr) + \frac{I_3}{2}\bigl(\dot\phi\cos\theta + \dot\psi\bigr)^2, \quad\text{while}\quad V = Mgl\cos\theta.$$

We note that both $\psi$ and $\phi$ are cyclic, so that along a solution c

$$\frac{\partial L}{\partial\dot\phi}\ \text{and}\ \frac{\partial L}{\partial\dot\psi}\ \text{are constant, i.e.,}\ \frac{\partial L}{\partial\dot\phi}(c'(t))\ \text{and}\ \frac{\partial L}{\partial\dot\psi}(c'(t))\ \text{are constant.}$$

The first gives equation (B) on page 346, the second equation (C), fulfilling our
promise of obtaining these equations without worrying about what they mean.


When $q^i$ are standard rectangular coordinates, we have

$$\frac{\partial L}{\partial\dot q^i} = \frac{\partial T}{\partial\dot q^i} = m\dot q^i,\ \text{a component of the momentum } mv,$$

and in general $p_i = \partial L/\partial\dot q^i$ is called the (generalized) conjugate momentum to $q^i$
(so for the heavy top, we've seen that the conjugate momenta $p_\phi$ and $p_\psi$ are
constants). Classically, momentum mv was written as p, which is what suggested
the notation $p_i$, and the accompanying $q^i$ for the coordinates.


The fact that Lagrange’s equations can be written using any convenient co- 
ordinate system brings up an interesting point. Although we were led to these 
equations from the perspective of mechanics, one can consider an arbitrary
“Lagrangian” function $L: TM \to \mathbb R$ and the Lagrange equations for curves
$c: \mathbb R \to M$, and it turns out that in this general case the equations are again
“invariant”: if they hold in one coordinate system, they hold in any other. We
will prove this by a method that avoids the unpleasantness of a messy compu- 
tation in Chapter 13, and in a much better way in Part IV. 


Constraint problems. As an illustration of the use of Lagrange’s equations for 
a constraint problem, we consider the example from Chapter 6 of a block of 
mass m sliding on a wedge of mass M, as in the figure below (pace Lagrange).

[Figure: a block of mass m on a wedge with incline angle $\alpha$; x marks the position of one end of the wedge.]

We use the same obvious coordinate system (x, s) as before, where x is the
position of one end of the wedge, and s the distance from the top of the wedge
to (the center of) the block. We clearly have

$$V = \text{constant} + mg(S - s)\sin\alpha,$$

where S is the total length of the slanted side of the wedge (the constant,
included because of the additional height of the center of the block above the
wedge, is of course irrelevant, as indeed is S). The contribution of the wedge to
the kinetic energy T is clearly $\tfrac12 M\dot x^2$. The one thing needing a bit of thought
(and maybe even the figure!) is the contribution of the block to the kinetic
energy T. The velocity of the block is given by

$$v = (\dot x + \dot s\cos\alpha,\ -\dot s\sin\alpha),$$

so we find that

$$T = \tfrac12 M\dot x^2 + \tfrac12 m(\dot x^2 + 2\dot x\dot s\cos\alpha + \dot s^2).$$


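Then the non-zero partials of interest are

$$\frac{\partial T}{\partial\dot x} = (M + m)\dot x + m\dot s\cos\alpha$$

$$\frac{\partial T}{\partial\dot s} = m\dot x\cos\alpha + m\dot s, \qquad \frac{\partial V}{\partial s} = -mg\sin\alpha,$$

and Lagrange's equations become

$$\frac{d}{dt}\bigl((M + m)\dot x + m\dot s\cos\alpha\bigr) = 0$$

$$\frac{d}{dt}(m\dot x\cos\alpha + m\dot s) - mg\sin\alpha = 0,$$

immediately giving the two equations that we obtained at the top of page 220.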
Although one can easily correlate the steps in this analysis with those used on 
page 219, the solution via Lagrange’s equations can be carried out in a much 
more systematic and straightforward way. 
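
Since the coefficients are constant, the two equations are linear in $\ddot x$ and $\ddot s$ and can be solved in closed form. The sketch below is offered as a check rather than as the solution from Chapter 6; the closed form is my own elimination, the function name `wedge_accelerations` is invented, and the sample masses and angle are arbitrary.

```python
import math

def wedge_accelerations(M, m, alpha, g=9.81):
    """Solve (M+m) x'' + m s'' cos(alpha) = 0 and
    m x'' cos(alpha) + m s'' = m g sin(alpha) for (x'', s'')."""
    denom = M + m * math.sin(alpha) ** 2
    s_ddot = (M + m) * g * math.sin(alpha) / denom
    x_ddot = -m * g * math.sin(alpha) * math.cos(alpha) / denom
    return x_ddot, s_ddot

M, m, alpha, g = 3.0, 1.0, 0.5, 9.81
x_ddot, s_ddot = wedge_accelerations(M, m, alpha, g)
# Both residuals below should vanish up to rounding.
print((M + m) * x_ddot + m * s_ddot * math.cos(alpha))
print(m * x_ddot * math.cos(alpha) + m * s_ddot - m * g * math.sin(alpha))
```

For a very heavy wedge the block's acceleration along the incline approaches $g\sin\alpha$, as one would expect for a fixed incline.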

An even more important simplification is illustrated by another example from
Chapter 6, a wheel of radius R and mass m rolling down a (stationary) inclined
plane, with the coordinates s and $\theta$. Strictly speaking, we should regard this as
a problem in $\mathbb R^N$ for some very large N, with rolling being one constraint, and
the fact that the wheel is a rigid body the result of a multitude of constraints. In
Chapter 6 we handled this situation, perhaps somewhat mysteriously, by saying
that the mass m of the wheel is associated with s, while the moment of inertia I
of the wheel should be associated with the angle $\theta$ by which the wheel has
rotated. In the context of Lagrange's equations, we simply note that the kinetic
energy T of the wheel is the sum of its translational and rotational energy, and
if I is the moment of inertia of the wheel about its center, then Problem 5-7
shows that

$$T = \tfrac12 m\dot s^2 + \tfrac12 I\dot\theta^2$$

(see Problem 2 for the more general case where the disc is moving on a plane,
even including the more general situation where it is not necessarily upright).

Our rolling constraint gives us $\theta = s/R$ so that

$$T = \tfrac12 m\dot s^2\Bigl(1 + \frac{I}{R^2m}\Bigr),$$

and V is as before, giving the single Lagrange equation

$$\frac{d}{dt}\Bigl(m\dot s\Bigl(1 + \frac{I}{R^2m}\Bigr)\Bigr) = mg\sin\alpha \implies \ddot s = \frac{g\sin\alpha}{1 + \dfrac{I}{R^2m}},$$

the same equation we obtained previously.
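
For a feel for the numbers, the formula can be evaluated for a few standard shapes. This is just an illustrative sketch: the function name `rolling_acceleration` is invented, and the ratios $I/(R^2m) = 1$ for a hoop and $1/2$ for a uniform disc are the usual textbook values, assumed here rather than taken from the text.

```python
import math

def rolling_acceleration(alpha, inertia_ratio, g=9.81):
    """s'' = g sin(alpha) / (1 + I/(R^2 m)), with inertia_ratio = I/(R^2 m)."""
    return g * math.sin(alpha) / (1.0 + inertia_ratio)

alpha = math.radians(30)
for name, ratio in (("sliding point mass", 0.0), ("uniform disc", 0.5), ("hoop", 1.0)):
    print(name, rolling_acceleration(alpha, ratio))
```

The hoop accelerates at exactly half the rate of a frictionlessly sliding mass, and the disc at two thirds of it.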
Admittedly, this particular example is hardly any different than the solution 
in Problem 6-8 (b). Addendum A and Problem 2 give more interesting uses. 


Conservation of energy; action. The formulas for T in the previous examples,

$$T = \tfrac12 m(\dot r^2 + r^2\dot\theta^2)$$

$$T = \tfrac12 I_1(\dot\theta^2 + \dot\phi^2\sin^2\theta) + \tfrac12 I_3(\dot\phi\cos\theta + \dot\psi)^2$$

$$T = \tfrac12 M\dot x^2 + \tfrac12 m(\dot x^2 + 2\dot x\dot s\cos\alpha + \dot s^2)$$

$$T = \tfrac12 m\dot s^2 + \tfrac12 I\dot\theta^2$$

are all homogeneous quadratic functions of the $\dot q^i$, that is, if all the $\dot q^i$ are
multiplied by k, then T is multiplied by $k^2$.

It is easy to see that this will always be the case, since the third from last
formula on page 440 can be written as

$$\dot x^\alpha(v) = \sum_{i=1}^n a_{\alpha i}\,\dot q^i(v)$$

for certain functions $a_{\alpha i}$, so each $[\dot x^\alpha(v)]^2$ is a homogeneous quadratic function
of the $\dot q^i$. And this turns out to have an important consequence.




As a matter of notation, for a curve c in M we should write L(c'(t)), since L
is a function on TM, with the function V really standing for $V\circ\pi$. However,
it is often helpful to write L(c(t), c'(t)) when we are dealing with a coordinate
system, or even L(c(t)) when we are considering partial derivatives with respect
to the $q^i$, and we will allow ourselves to slip back and forth between these
expressions without comment.

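Since we can write

$$\frac{d}{dt}L(c'(t)) = \sum_{i=1}^n\frac{\partial L}{\partial q^i}(c'(t))\cdot\dot q^i(c'(t)) + \sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot(\dot q^i\circ c')'(t),$$

we see that if c satisfies Lagrange's equations, then

$$\begin{aligned}
\frac{d}{dt}L(c'(t)) &= \sum_{i=1}^n\frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}(c'(t))\Bigr)\cdot\dot q^i(c'(t)) + \sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot(\dot q^i\circ c')'(t)\\
&= \frac{d}{dt}\Bigl(\sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t))\Bigr),
\end{aligned}$$

which implies that

$$\sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t)) - L = E \quad\text{for a constant } E.$$

Since V does not depend on the $\dot q^i$, we have

$$\sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t)) = \sum_{i=1}^n\frac{\partial T}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t)),$$

and since T is a homogeneous quadratic function of the $\dot q^i$, Euler's theorem on
homogeneous functions¹ says that

$$\sum_{i=1}^n\frac{\partial T}{\partial\dot q^i}\cdot\dot q^i = 2T,$$

so that

$$E = 2T - L = T + V,$$

the usual expression for energy.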
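
Euler's theorem is easy to test on one of the examples. The sketch below checks $\sum_i \dot q^i\,\partial T/\partial\dot q^i = 2T$ for the block-and-wedge kinetic energy, using central differences for the partial derivatives; the sample masses, angle, and velocities are arbitrary, and the names `T_wedge` and `euler_sum` are invented for the illustration.

```python
import math

def T_wedge(xdot, sdot, M=3.0, m=1.0, alpha=0.5):
    """Kinetic energy of the block-and-wedge system from the text."""
    return (0.5 * M * xdot**2
            + 0.5 * m * (xdot**2 + 2 * xdot * sdot * math.cos(alpha) + sdot**2))

def euler_sum(T, v, h=1e-5):
    """Return sum_i v[i] * dT/dv[i], with central-difference partials."""
    total = 0.0
    for i in range(len(v)):
        up, dn = list(v), list(v)
        up[i] += h
        dn[i] -= h
        total += v[i] * (T(*up) - T(*dn)) / (2 * h)
    return total

v = [0.7, -1.2]
print(euler_sum(T_wedge, v), 2 * T_wedge(*v))   # the two numbers should agree
```

Because T is exactly quadratic in the velocities, the central differences are exact up to rounding, so the agreement is not an accident of the step size.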


¹ If $F(tx) = t^kF(x)$ for $x \in \mathbb R^n$, then $\sum_i x^iD_iF(x) = kF(x)$. Proof: Take the derivative
with respect to t, and set t = 1.


The first part of our expression for E, the term

$$\sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t)),$$

is called the action of c (a name whose origin will be explained in Chapter 15),
and merits further consideration in the next section.


Time-dependent Lagrangians. In some cases we have to consider Lagrangians
$L: TM\times\mathbb R \to \mathbb R$ that depend explicitly on the time t. An obvious example
would arise if V itself depended on t, reflecting an outside force, like the forced
oscillations of Chapter 8.

Another common example occurs whenever we have time-dependent con- 
straints, as in the case of a bead sliding on a rotating wire, which we considered 
in Chapter 6, with gravity ignored (or equivalently, with the wire rotating in a 
horizontal plane). 

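In the case of time-dependent constraints, the equations that we started with
on page 440 need to be written as

$$x^\alpha = X^\alpha\circ(q^1, \dots, q^n, t),$$

$$\dot x^\alpha = \dot X^\alpha\circ(q^1, \dots, q^n, \dot q^1, \dots, \dot q^n, t),$$

and the first gives

$$(x^\alpha\circ c)'(t) = \text{previous terms} + \frac{\partial X^\alpha}{\partial t}.$$

This gives an extra term involving $\partial X^\alpha/\partial t$ in the formula for $\dot X^\alpha$, but it ends
up having no effect on our equation $\partial\dot x^\alpha/\partial\dot q^i = \partial x^\alpha/\partial q^i$, or any subsequent
equations in our derivation of Lagrange's equations.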

In the case of the bead on a rotating wire, the calculations of T on page 443
still hold, giving

$$T = \tfrac12 m(\dot r^2 + r^2\dot\theta^2),$$

except that now there is the single coordinate r, while $\theta$ is a given function
describing how the wire rotates. In this problem we have V = 0, and just the
one Lagrange equation

$$\frac{d}{dt}\Bigl(\frac{\partial T}{\partial\dot r}\Bigr) - \frac{\partial T}{\partial r} = 0, \quad\text{i.e.,}\quad \ddot r = r\dot\theta^2.$$




Although Lagrange's equations for time-dependent Lagrangians remain the
same, the situation for conservation of energy is quite different. The first
equation on page 448 now has to be written with an extra term,

$$\frac{d}{dt}L(c'(t)) = (\ \dots\dots\ ) + \frac{\partial L}{\partial t}(c'(t)),$$

so we cannot expect conservation of energy to hold, which is not surprising for
a bead of non-negligible mass m on a rotating wire, since some extra source of
energy must be supplied to rotate the wire.

However, that is only half the story. Consider a wire that is being rotated with
constant angular velocity, $\dot\theta = \omega$ for a constant $\omega$, in which case the Lagrangian
actually doesn't depend explicitly on t. Now we get

$$\ddot r = r\omega^2,$$

and since

$$T + V = T = \tfrac12 m(\dot r^2 + r^2\omega^2)$$

we find that

$$\frac{d}{dt}(T + V) = m(\dot r\ddot r + r\dot r\omega^2) = 2mr\dot r\omega^2,$$

which is non-zero for $\omega \ne 0$, as we would expect even in this special case.
This doesn't contradict the results of the previous section, because in cases
involving time-dependent constraints, when we write the kinetic energy
$T = \tfrac12\sum_{\alpha=1}^N m_\alpha[\dot x^\alpha]^2$ in terms of the $q^i$ and $\dot q^i$ there will be extra terms involving $\partial X^\alpha/\partial t$,
even if the Lagrangian doesn't happen to depend explicitly on time. For example,
in the present case, although the kinetic energy $T = \tfrac12 m(\dot r^2 + r^2\dot\theta^2)$ for a
single particle is a homogeneous quadratic form in $\dot\theta$ and $\dot r$, the T for the bead
on a rotating wire,

$$T = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) \quad\text{for a given function } \theta,$$

is not a homogeneous quadratic form in the single coordinate $\dot r$. Thus, our
identification of the constant E with T + V cannot be carried through.


So the result of the previous section can best be expressed by saying that for
Lagrangians that do not depend explicitly on t, the action minus L,

$$\sum_{i=1}^n\frac{\partial L}{\partial\dot q^i}(c'(t))\cdot\dot q^i(c'(t)) - L(c'(t)),$$

is always a constant; however, this combination will usually not be the same as
the energy T + V when our constraints depend on time.
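
The bead on a uniformly rotating wire makes the distinction concrete. The sketch below assumes the particular solution $r(t) = \cosh\omega t$ of $\ddot r = r\omega^2$ (that is, $r(0) = 1$, $\dot r(0) = 0$), with made-up values $\omega = 2$, m = 1, and evaluates both T + V = T and the combination $\dot r\,\partial L/\partial\dot r - L = \tfrac12 m(\dot r^2 - r^2\omega^2)$; the function name `bead` is invented.

```python
import math

def bead(t, omega=2.0, m=1.0):
    """Return (T, action minus L) for the bead at time t along r = cosh(omega t)."""
    r = math.cosh(omega * t)                  # solves r'' = omega^2 r, r(0)=1, r'(0)=0
    rdot = omega * math.sinh(omega * t)
    T = 0.5 * m * (rdot**2 + (r * omega)**2)  # kinetic energy; V = 0 here
    conserved = 0.5 * m * (rdot**2 - (r * omega)**2)   # rdot * dL/drdot - L
    return T, conserved

for t in (0.0, 0.5, 1.0):
    print(t, bead(t))
```

The second entry stays at $-\tfrac12 m\omega^2$ for every t, while the first grows without bound as the wire does work on the bead.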




Lagrange multipliers. Not surprisingly, Lagrange multipliers can also be used
with Lagrange's equations, now using d'Alembert's Principle for Differential
Constraints, from page 233:


If the constraints on a system require the tangent vector of the motion
to lie in the subspace $\ker(\omega_1)\cap\dots\cap\ker(\omega_L)$, then there are Lagrange
multipliers $\lambda_1, \dots, \lambda_L$ such that the motions of the system under the
external forces F satisfy

$$\langle F - mc'', v\rangle = \lambda_1\omega_1(v) + \dots + \lambda_L\omega_L(v)$$

for all tangent vectors v at c.


We can write this as

$$\sum_{\alpha=1}^N\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\cdot dx^\alpha(c(t)) = \sum_{l=1}^L\lambda_l\,\omega_l(c(t)),$$

while each $\omega_l$, when restricted to TM, can be written as

$$\omega_l = \sum_{i=1}^n a_{li}\,dq^i$$

for certain functions $a_{li}$ describing the constraints. So on $M_{c(t)}$ we also have
the complicated looking equations

$$\sum_{\alpha=1}^N\sum_{i=1}^n\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\cdot\frac{\partial x^\alpha}{\partial q^i}\,dq^i = \sum_{l=1}^L\sum_{i=1}^n\lambda_l a_{li}\,dq^i,$$

leading to the set of equations

$$\sum_{\alpha=1}^N\bigl(F_\alpha(c(t)) - m_\alpha c_\alpha''(t)\bigr)\frac{\partial x^\alpha}{\partial q^i} = \sum_{l=1}^L\lambda_l a_{li}(c(t)), \qquad i = 1, \dots, n.$$

Then the remaining parts of the argument for Lagrange's equations lead to

$$\frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}\Bigr) - \frac{\partial L}{\partial q^i} = \sum_{l=1}^L\lambda_l a_{li}$$

(with the sign of the multipliers $\lambda_l$ reversed, which is harmless since they are unknowns to be determined).

In practice this works out a lot simpler than it may look! Note that the
index l simply refers to constraint number l, while the index i refers to the i-th
coordinate, so, for example, when using coordinates r, $\theta$ one would write things
like $a_{lr}$ and $a_{l\theta}$ for constraint number l (cf. Problem 2). Remember also that
one generally needs to differentiate the relations implied by the constraints, as
was done on page 234 for the constraint equations (1) at the top of that page.







ADDENDUM 12A 
LAGRANGE’S ROLLING DISC 


In contrast to the treatment in Addendum 9A, we will now use Lagrange's
equations, and Lagrange multipliers, to obtain equations for the general case of
a disc rolling on a plane¹ (you may wish to delay this complicated analysis until
after doing the Problems). We will be using the Euler angles as coordinates, as
in the figure, which looks very much like the figure of Addendum 9A, except
that we will not be using Euler's equations in any form, and the two vectors
now labeled $r_1$ and $r_2$ are simply convenient reference axes. Since l is the
line of nodes, when we do consider $u_1$, we see that the Euler angle $\psi$ measures
the rotation from the horizontal to $u_1$, so $\dot\psi$ is the rate at which the wheel is
rotating.

We first want to find the components of the angular velocity vector of the
disc with respect to the axes $r_1, r_2, u_3$. The same sort of considerations that
we used to get equation (b) at the bottom of page 363 shows that the desired
components p, q, and r of $\omega$ are

$$\text{(A)}\qquad p = \dot\theta, \qquad q = \dot\phi\sin\theta, \qquad r = \dot\phi\cos\theta + \dot\psi.$$

In addition to $\theta, \phi, \psi$, we will use the coordinates x, y of O, while the
essentially redundant coordinate z of O is $z = a\sin\theta$. Using the moments
of inertia of the disc given by Problem 5-6, together with Problem 2 (c) of this
chapter, we see that

$$\text{(B)}\qquad \begin{aligned}
T &= \tfrac12 m(\dot x^2 + \dot y^2 + a^2\dot\theta^2\cos^2\theta) + \tfrac14 ma^2\dot\theta^2 + \tfrac14 ma^2\dot\phi^2\sin^2\theta + \tfrac12 ma^2(\dot\phi\cos\theta + \dot\psi)^2\\
V &= mga\sin\theta.
\end{aligned}$$


¹ From Cabannes [1].




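Although Lagrange's equations are supposed to simplify the solution of
mechanics problems, a lot depends on choosing good coordinates. In our problem,
it will be convenient to express the rolling condition in terms of: the directions
of the vector a along l, parallel to $r_1$, making an angle of $\phi$ with respect to
the x-axis; the vector b in the (x, y)-plane perpendicular to it, making an
angle of $\phi + \pi/2$ with the x-axis (and pointing in the direction of the horizontal
projection of $r_2$); and the vertical direction.

Then $\omega$ and the vector $-ar_2$ from the center O to the contact point have
components

$$(\dot\theta,\ -\dot\psi\sin\theta,\ \dot\psi\cos\theta + \dot\phi) \quad\text{and}\quad (0,\ -a\cos\theta,\ -a\sin\theta)$$

so that the components of $\omega\times(-ar_2)$ are

$$\bigl(a(\dot\phi\cos\theta + \dot\psi),\ a\dot\theta\sin\theta,\ -a\dot\theta\cos\theta\bigr),$$

and the rolling conditions become (compare pages 240 and 363)

$$\text{(C)}\qquad \begin{aligned}
\dot x\cos\phi + \dot y\sin\phi + a(\dot\phi\cos\theta + \dot\psi) &= 0\\
-\dot x\sin\phi + \dot y\cos\phi + a\dot\theta\sin\theta &= 0,
\end{aligned}$$

leading us to consider the 1-forms

$$\lambda_1(\cos\phi\,dx + \sin\phi\,dy + a\cos\theta\,d\phi + a\,d\psi),$$

$$\lambda_2(-\sin\phi\,dx + \cos\phi\,dy + a\sin\theta\,d\theta).$$

Ignoring the irrelevant coordinate z, the equations on page 451 become

$$\begin{aligned}
&(1)\ [\text{for } x] && m\ddot x = \lambda_1\cos\phi - \lambda_2\sin\phi\\
&(2)\ [\text{for } y] && m\ddot y = \lambda_1\sin\phi + \lambda_2\cos\phi\\
&(3)\ [\text{for } \theta] && ma^2(\dot\theta\cos^2\theta)' + \tfrac12 ma^2\ddot\theta + ma^2\dot\theta^2\sin\theta\cos\theta\\
& && \quad - \tfrac12 ma^2\dot\phi^2\sin\theta\cos\theta + ma^2\dot\phi(\dot\phi\cos\theta + \dot\psi)\sin\theta\\
& && \quad = \lambda_2 a\sin\theta - mga\cos\theta\\
&(4)\ [\text{for } \phi] && \tfrac12 ma^2(\dot\phi\sin^2\theta)' + ma^2[\cos\theta(\dot\phi\cos\theta + \dot\psi)]' = \lambda_1 a\cos\theta\\
&(5)\ [\text{for } \psi] && ma^2(\dot\phi\cos\theta + \dot\psi)' = \lambda_1 a.
\end{aligned}$$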




Equations (1) and (2) give 


l= 
do = 


Sm(X cos@ + ysin @) 
+m(—X sing + ycos¢), 


and by using equations (C) and their derivatives, we find that 


A, = mal 6¢ sin 9 — (¢cos@ + wy] 
Ar = —ma|o(¢ cos 6 + yw) + (6 sin 0)’ |. 

Writing things in terms of the abbreviations p, g, and r introduced in equa- 
tion (A), we find from (5) that 27 — pq = 0, and, making use of this, we obtain 
the three equations 

q+ p(qcot@ —2r) =0 
3p+q(4r —qcoté) = Peas 0, 
a 


which are really equations in only three unknown functions, since p = 6. 
A disc rolling vertically along a straight line,
\[
\theta = \pi/2, \qquad \phi = \text{constant}, \qquad \dot\psi = \text{constant},
\]
has $p = q = \dot r = 0$ and $\cos\theta = 0$, so the equations are satisfied.
For a solution with the disc rolling along a circle, inclined at a fixed angle,
\[
\theta = \text{constant}, \qquad \dot\phi = \text{constant}, \qquad \dot\psi = \text{constant},
\]
with $p = 0$, $q = \text{constant}$, and $r = \text{constant}$, the first two equations are automatic, while the third equation,
\[
q(4r - q\cot\theta) + \frac{2g}{a}\cos\theta = 0,
\]
now gives the condition connecting the angle at which the disc is inclined and the centripetal force that must be exerted in order for the disc to move in a circle.
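As a small numerical illustration of the steady circular motion just described, the following sketch solves the condition for the spin $r$ when the inclination $\theta$ and the value of $q$ are given. It reads the condition as $q(4r - q\cot\theta) + (2g/a)\cos\theta = 0$; the coefficients are hard to read in the scanned source, so that form, like the function names and sample values, is an assumption made for illustration.

```python
import math

def steady_spin(theta, q, a, g=9.81):
    """Spin r for which p = 0, q = const, r = const solves the assumed
    condition q(4r - q cot(theta)) + (2g/a) cos(theta) = 0."""
    return (q / math.tan(theta) - (2 * g / a) * math.cos(theta) / q) / 4

def third_equation_residual(theta, q, r, a, g=9.81):
    """Left-hand side of the assumed condition (the p-dot term is zero here)."""
    return q * (4 * r - q / math.tan(theta)) + (2 * g / a) * math.cos(theta)

# Hypothetical sample values: inclination 70 degrees, q = 3 rad/s, radius 0.5 m.
theta, q, a = math.radians(70.0), 3.0, 0.5
r = steady_spin(theta, q, a)
print(r, third_equation_residual(theta, q, r, a))
```

The residual printed at the end should vanish up to rounding, confirming only that the algebra is consistent, not that the assumed coefficients match the printed book.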
Our three equations are actually amenable to a bit of manipulation. Dividing the first two equations of (D) by $p = d\theta/dt$, we obtain
\[
\frac{dq}{d\theta} + q\cot\theta - 2r = 0, \qquad \frac{dr}{d\theta} = \frac{q}{2},
\]
leading to the single equation
\[
\frac{d^2 r}{d\theta^2} + \cot\theta\,\frac{dr}{d\theta} - r = 0,
\]
from which we could also work backwards.


It turns out that this equation can, in a sense, be solved. The substitution $s = \cos^2\theta$ changes it into the "hypergeometric equation"
\[
s(1 - s)\frac{d^2 r}{ds^2} + \Bigl(\frac12 - \frac32 s\Bigr)\frac{dr}{ds} - \frac14 r = 0,
\]
one solution of which is given by the infinite sum
\[
r = \sum_{n=0}^{\infty} a_n s^n, \qquad\text{with}\qquad a_n = \frac{4n^2 - 6n + 3}{2n(2n - 1)}\,a_{n-1}.
\]
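The series can be checked numerically. The sketch below builds the coefficients from the recurrence $a_n = \frac{4n^2-6n+3}{2n(2n-1)}a_{n-1}$ (which is what substituting a power series into $s(1-s)r'' + (\tfrac12 - \tfrac32 s)r' - \tfrac14 r = 0$ yields) and evaluates the left-hand side of that equation on the truncated sum. The function names, the choice $a_0 = 1$, and the truncation length are assumptions made for illustration.

```python
def series_coefficients(n_terms, a0=1.0):
    """Coefficients a_0, ..., a_{n_terms-1} from the recurrence."""
    coeffs = [a0]
    for n in range(1, n_terms):
        coeffs.append(coeffs[-1] * (4 * n * n - 6 * n + 3) / (2 * n * (2 * n - 1)))
    return coeffs

def ode_residual(coeffs, s):
    """s(1-s) r'' + (1/2 - 3s/2) r' - r/4 for the truncated series r(s)."""
    r = sum(a * s ** n for n, a in enumerate(coeffs))
    dr = sum(n * a * s ** (n - 1) for n, a in enumerate(coeffs) if n >= 1)
    d2r = sum(n * (n - 1) * a * s ** (n - 2) for n, a in enumerate(coeffs) if n >= 2)
    return s * (1 - s) * d2r + (0.5 - 1.5 * s) * dr - 0.25 * r

coeffs = series_coefficients(60)
print(coeffs[:3], ode_residual(coeffs, 0.3))
```

For $0 \le s < 1$ the residual is limited only by truncation and rounding; near $s = 1$ (that is, $\theta = 0$) many more terms would be needed.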




PROBLEMS 


1. Consider a pendulum for which the bob of mass $m$ is suspended by a spring with spring constant $k$, unstretched length $l$, and length $l + x(t)$ at time $t$, giving a radial force of $-kx$. Find the Lagrangian, obtain the equations
\[
\begin{aligned}
m\ddot x &= \underline{m(l + x)\dot\theta^2} + mg\cos\theta - kx\\
m(l + x)\ddot\theta + \underline{2m\dot x\dot\theta} &= -mg\sin\theta,
\end{aligned}
\]
and interpret them, and the underlined expressions, in terms of the rotating coordinate system determined by the spring.

[Answer: $L = \tfrac{m}{2}\bigl[\dot x^2 + (l + x)^2\dot\theta^2\bigr] + mg(l + x)\cos\theta - kx^2/2$. The first equation is the radial component of the equation $F = ma$ with the centrifugal force added in, and the second is the tangential component with the Coriolis force added in.]

2. (a) On page 447 we used the fact that for a disc moving along a straight line (whether rolling or not), the kinetic energy $T$ is given by $T = \tfrac12 m\dot x^2 + \tfrac12 I\dot\theta^2$. For the case of a disc moving upright on a plane, with $I_\phi$ the moment of inertia about the vertical axis, the obvious guess for the formula for $T$ is
\[
T = \tfrac12\bigl(m\dot x^2 + m\dot y^2 + I\dot\theta^2 + I_\phi\dot\phi^2\bigr),
\]
corresponding to the analysis on page 234. Prove this formula, again using the formula for $T_{\mathrm{rot}}$ on page 194.

(b) When the disc is rolling, so that we have
\[
\begin{aligned}
\dot x &= R\cos\phi\,\dot\theta\\
\dot y &= R\sin\phi\,\dot\theta,
\end{aligned}
\qquad\text{expressed by}\qquad
\begin{aligned}
dx - R\cos\phi\,d\theta &= 0\\
dy - R\sin\phi\,d\theta &= 0,
\end{aligned}
\]
we want to use Lagrange multipliers to form
\[
\begin{aligned}
&\lambda_1(dx - R\cos\phi\,d\theta)\\
&\lambda_2(dy - R\sin\phi\,d\theta);
\end{aligned}
\]
in the notation on page 451, we have, e.g., $a_{1x} = \lambda_1$ and $a_{2\theta} = -\lambda_2 R\sin\phi$. Show that we now obtain, up to sign, the same equations as those on page 234. (The case of a disc rolling down an inclined plane may be handled in exactly the same manner.)

(c) Extend part (a) to the case where the disc is not necessarily upright, resulting in another term involving the moment of inertia about a diameter.




3. (a) For a pendulum with a bob of mass $m$ on a string of length $l$, with $\theta$ the angle from the vertical, we have
\[
L = \tfrac12 ml^2\dot\theta^2 + mgl\cos\theta.
\]
Use Lagrange's equation to obtain the standard pendulum equation
\[
\ddot\theta + \frac{g}{l}\sin\theta = 0.
\]
(b) To get the tension, consider the Lagrangian
\[
L = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) + mgr\cos\theta
\]
on the whole plane, where $r$ is radial distance from the pivot point, together with the constraint $r = l$, expressed by $dr = 0$ (so $a_{1r} = 1$), and use the equations derived on page 451 to get
\[
m\ddot r - mr\dot\theta^2 - mg\cos\theta = \lambda,
\]
or for $r = l$ (and $\dot r = \ddot r = 0$) simply
\[
-ml\dot\theta^2 - mg\cos\theta = \lambda.
\]


4. (a) Consider the spherical pendulum of Problem 3-5, with the path parameterized by $\theta$ and $\phi$. Show that
\[
L = \tfrac12 ml^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + mgl\cos\theta
\]
and conclude that $l^2\sin^2\theta\,\dot\phi$ is constant. Compare with Problem 3-5 (direct use of the coordinates $x$ and $y$ with $V = mg\sqrt{l^2 - x^2 - y^2}$ makes the problem more difficult).

(b) To find the tension of the string of the pendulum, consider the full set of coordinates $(r, \theta, \phi)$, with
\[
L = \tfrac12 m(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2) + mgr\cos\theta,
\]
and use a Lagrange multiplier for the constraint $r = l$ to obtain
\[
mg\cos\theta + ml(\dot\theta^2 + \dot\phi^2\sin^2\theta) = -\lambda.
\]




5. (a) For the sliding particle of Problem 6-11, we have
\[
L = \tfrac12 ml^2\dot\theta^2 - mgl\sin\theta.
\]
Conclude from the Lagrange equation for $\theta$ that
\[
l^2\ddot\theta + gl\cos\theta = 0.
\]
(b) Now consider the Lagrangian for the coordinates $(r, \theta)$ in the plane,
\[
L = \tfrac12 m(\dot r^2 + r^2\dot\theta^2) - mgr\sin\theta,
\]
with the constraint $r = l$. Obtain
\[
m\ddot r - mr\dot\theta^2 + mg\sin\theta = \lambda
\]
so that $\lambda = 0$ precisely when $l\dot\theta^2 = g\sin\theta$. As in Problem 6-11, this can be used to determine when the particle leaves the path, though we still need the remainder of the analysis in that problem.


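6. (a) For the sliding stick of Problem 6-13, we have, using the value $ml^2/3$ for the moment of inertia of the stick, as mentioned on page 263,
\[
L = \tfrac12 m\dot x^2 + \tfrac12 m\dot y^2 + \tfrac16 ml^2\dot\theta^2 - mgl\sin\theta.
\]
Writing this totally in terms of $\theta$, use the Lagrange equation to obtain
\[
\ddot\theta + \frac{3g}{4l}\cos\theta = 0.
\]
(b) For $L$ in the original form, in terms of the coordinates $x, y, \theta$, the conditions that the stick touches the wall and the floor are
\[
\begin{aligned}
x &= l\cos\theta\\
y &= l\sin\theta,
\end{aligned}
\qquad\text{or as differential constraints}\qquad
\begin{aligned}
dx + l\sin\theta\,d\theta &= 0\\
dy - l\cos\theta\,d\theta &= 0.
\end{aligned}
\]
Using Lagrange multipliers $\lambda_1$ and $\lambda_2$ to form
\[
\begin{aligned}
&\lambda_1(dx + l\sin\theta\,d\theta)\\
&\lambda_2(dy - l\cos\theta\,d\theta),
\end{aligned}
\]
obtain the equations
\[
\begin{aligned}
m\ddot x &= \lambda_1\\
m\ddot y &= \lambda_2\\
\tfrac13 ml^2\ddot\theta + mlg\cos\theta &= l(\lambda_1\sin\theta - \lambda_2\cos\theta)\\
&= -ml^2\ddot\theta,
\end{aligned}
\]
which can then be combined with our previous equation.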


7. A 1-form $\alpha$ on $M$ is, strictly speaking, a function on $M$ whose value $\alpha(p)$ for $p \in M$ is a function on $M_p$, but we can obviously also think of $\alpha$ simply as a function on $TM$, which in a coordinate system has the form $\alpha = \sum_{i=1}^n A_i\dot q^i$ for certain functions $A_i$ on the domain of the coordinate system.

(a) If $L\colon TM \to \mathbb{R}$ and we let $\bar L = L + \alpha + C$ for a constant $C$, then for any curve $c$ in $M$ we have
\[
\begin{aligned}
\frac{d}{dt}\Bigl(\frac{\partial\bar L}{\partial\dot q^i}(c'(t))\Bigr) &= \frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}(c'(t))\Bigr) + \sum_{j=1}^n \frac{\partial A_i}{\partial q^j}(c(t))\,(q^j\circ c)'(t)\\
\frac{\partial\bar L}{\partial q^i}(c'(t)) &= \frac{\partial L}{\partial q^i}(c'(t)) + \sum_{j=1}^n \frac{\partial A_j}{\partial q^i}(c(t))\,(q^j\circ c)'(t).
\end{aligned}
\]
(b) Conclude that if $\alpha$ is a closed 1-form, $d\alpha = 0$, then for each $c$ the Lagrange equations for the Lagrangians $L$ and $\bar L$ are exactly the same.
(c) Conversely, if they are always exactly the same, then $\alpha$ must be closed.

This is expressed, in terminology first introduced for equations of electromagnetism, by saying that $L \mapsto L + \alpha + C$ is a "gauge transformation" of the Lagrangian. Note that saying that the Lagrange equations for $L$ and $\bar L$ are exactly the same for all $c$ is not the same as saying that the equations merely have the same solutions. For example, for $TM = T\mathbb{R}^1$, with standard coordinates $x$ and $\dot x$, note that the Lagrangians $L(x, \dot x) = a\dot x^2$ for various constants $a$ all have the same solution curves, even though they lead to different equations.

For a formal statement of the equations being "the same", see Abraham and Marsden [1; §3.5].


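(d) For the converse, note that if Lagrange's equations for every curve $c$ are the same for $L$ and $\bar L$, then
\[
\frac{d}{dt}\Bigl(\frac{\partial(\bar L - L)}{\partial\dot q^i}(c'(t))\Bigr) = \frac{\partial(\bar L - L)}{\partial q^i}(c'(t))
\]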




for all c. Conclude that we must have separately 


0 0 
£ (Sew) =o amd Eee =o, 


and then that
\[
\bar L - L = \sum_{i=1}^n A_i\dot q^i + C = \alpha + C,
\]
and finally by (c) that $d\alpha = 0$.


Remark. In mechanics books, the usual, virtually incomprehensible, statement is that
\[
\bar L = L + \frac{d\phi}{dt}
\]
for some function $\phi$ on $M$. There are two points to unravel here, aside from the omission of the constant $C$. First, the closed 1-form $\alpha$ is being written as $d\phi$, which it can be locally. Then, the equation that we would write as $\bar L = L + d\phi$ is implicitly being applied to a tangent vector $c'(t)$, so that we have
\[
\begin{aligned}
\bar L(c'(t)) &= L(c'(t)) + d\phi(c'(t))\\
&= L(c'(t)) + \frac{d(\phi\circ c)}{dt},
\end{aligned}
\]
except that, as usual in physics notation, the $c$ is being suppressed, so that the final term has to be written as $d\phi/dt$! The result is often stated for the general case of Lagrangians $L(c'(t), t)$ that might depend explicitly on $t$, probably in order to disguise the meaninglessness of writing $d\phi/dt$ for a function $\phi$ on $M$.


CHAPTER 13 
VARIATIONAL PRINCIPLES 


This is like déjà vu all over again.
— Yogi Berra


Lagrange's equations will seem eerily familiar to anyone acquainted with differential geometry, because they so closely resemble the Euler equations, from the calculus of variations, which are used to investigate geodesics on Riemannian manifolds (DG, Chap. 9). And, in fact, Lagrange's equations are the Euler equations for the Lagrangian $L\colon TM \to \mathbb{R}$.


The Euler equations. As a brief review, we recall that if we have a suitably differentiable function
\[
F\colon \mathbb{R}\times\mathbb{R}\times\mathbb{R} \to \mathbb{R}
\]
we can seek, among all functions $f\colon [a,b] \to \mathbb{R}$ with $f(a) = a'$ and $f(b) = b'$, for some fixed $a'$ and $b'$, one which will maximize or minimize the quantity
\[
J(f) = \int_a^b F(f(t), f'(t), t)\,dt.
\]
We do this by considering a "variation" of $f$, that is, a function $\alpha\colon (-\varepsilon,\varepsilon)\times[a,b] \to \mathbb{R}$ such that $\alpha(0,t) = f(t)$. The functions $t \mapsto \alpha(u,t)$ are then a family of functions, indexed by $u \in (-\varepsilon,\varepsilon)$, which pass through $f$ for $u = 0$, and we denote these functions by $\bar\alpha(u)$, so that $\bar\alpha$ is a function from $(-\varepsilon,\varepsilon)$ to the set of functions $f\colon [a,b] \to \mathbb{R}$. If each $\bar\alpha(u)$ satisfies $\bar\alpha(u)(a) = a'$, $\bar\alpha(u)(b) = b'$, in other words, if $\alpha(u,a) = a'$ and $\alpha(u,b) = b'$ for all $u \in (-\varepsilon,\varepsilon)$, then we call $\alpha$ a variation of $f$ keeping endpoints fixed.
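For a variation $\alpha$ keeping endpoints fixed we want to compute
\[
\frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} = \frac{d}{du}\bigg|_{u=0}\int_a^b F\Bigl(\alpha(u,t), \frac{\partial\alpha}{\partial t}(u,t), t\Bigr)\,dt
\]
and seek a critical point or extremal $f$ of $J$, for which this derivative is 0 for all $\alpha$. We first move the differentiation inside the integral sign to get
\[
\begin{aligned}
&\int_a^b \frac{\partial}{\partial u}\bigg|_{u=0} F\Bigl(\alpha(u,t), \frac{\partial\alpha}{\partial t}(u,t), t\Bigr)\,dt\\
&\qquad = \int_a^b \Bigl[\frac{\partial\alpha}{\partial u}(0,t)\,\frac{\partial F}{\partial x}(f(t), f'(t), t) + \frac{\partial^2\alpha}{\partial u\,\partial t}(0,t)\,\frac{\partial F}{\partial y}(f(t), f'(t), t)\Bigr]\,dt,
\end{aligned}
\]
and apply integration by parts to the second term to obtain
\[
\begin{aligned}
\frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} = {}&\int_a^b \frac{\partial\alpha}{\partial u}(0,t)\Bigl[\frac{\partial F}{\partial x}(f(t), f'(t), t) - \frac{d}{dt}\Bigl(\frac{\partial F}{\partial y}(f(t), f'(t), t)\Bigr)\Bigr]\,dt\\
&+ \frac{\partial\alpha}{\partial u}(0,t)\,\frac{\partial F}{\partial y}(f(t), f'(t), t)\bigg|_a^b.
\end{aligned}
\]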

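Since the last term is 0 for variations $\alpha$ keeping endpoints fixed, and since $\partial\alpha/\partial u(0,t)$ can be any function vanishing at $a$ and $b$, we find that $f$ must satisfy "Euler's Equation"
\[
\frac{\partial F}{\partial x}(f(t), f'(t), t) - \frac{d}{dt}\Bigl(\frac{\partial F}{\partial y}(f(t), f'(t), t)\Bigr) = 0,
\]
where $\partial F/\partial x$ is officially $D_1 F$, the partial derivative of $F$ with respect to the first argument, and $\partial F/\partial y$ is officially $D_2 F$ (and $\partial/\partial t$ would be $D_3$).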
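To see Euler's Equation at work numerically, the following sketch takes the sample integrand $F(x, y, t) = \tfrac12 y^2 - \tfrac12 x^2$ (an assumption chosen for illustration, for which Euler's Equation reads $f'' = -f$) and estimates $dJ(\bar\alpha(u))/du$ at $u = 0$ for the variation $\alpha(u,t) = f(t) + uW(t)$ with $W$ vanishing at the endpoints. The helper names `action` and `first_variation` are hypothetical.

```python
import math

def action(f, df, a, b, n=2000):
    """Midpoint-rule value of J(f) for F(x, y, t) = y**2/2 - x**2/2."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += 0.5 * df(t) ** 2 - 0.5 * f(t) ** 2
    return total * h

def first_variation(f, df, W, dW, a, b, eps=1e-4):
    """Central difference in u of J(f + u*W) at u = 0."""
    def J(u):
        return action(lambda t: f(t) + u * W(t),
                      lambda t: df(t) + u * dW(t), a, b)
    return (J(eps) - J(-eps)) / (2 * eps)

W = lambda t: t * (1 - t)          # vanishes at t = 0 and t = 1
dW = lambda t: 1 - 2 * t

# f(t) = sin t satisfies f'' = -f, so the first variation should vanish.
on_solution = first_variation(math.sin, math.cos, W, dW, 0.0, 1.0)

# A straight line with the same endpoints does not satisfy Euler's Equation.
line = lambda t: t * math.sin(1.0)
dline = lambda t: math.sin(1.0)
off_solution = first_variation(line, dline, W, dW, 0.0, 1.0)

print(on_solution, off_solution)
```

The first number is zero up to quadrature error, while the second is visibly nonzero, which is all the criterion asserts: nothing here distinguishes a minimum from any other critical point.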
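The considerations are easily generalized to $f\colon [a,b] \to \mathbb{R}^n$ with
\[
J(f) = \int_a^b F(f(t), f'(t), t)\,dt \qquad\text{for } F\colon \mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R} \to \mathbb{R}.
\]
We obtain
\[
(*)\qquad
\begin{aligned}
\frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} = {}&\int_a^b \sum_{i=1}^n \frac{\partial\alpha^i}{\partial u}(0,t)\Bigl[\frac{\partial F}{\partial x^i}(f(t), f'(t), t) - \frac{d}{dt}\Bigl(\frac{\partial F}{\partial y^i}(f(t), f'(t), t)\Bigr)\Bigr]\,dt\\
&+ \sum_{i=1}^n \frac{\partial\alpha^i}{\partial u}(0,t)\,\frac{\partial F}{\partial y^i}(f(t), f'(t), t)\bigg|_a^b
\end{aligned}
\]
and we find that a critical point $f$ of $J$ must satisfy the $n$ equations
\[
\frac{\partial F}{\partial x^i}(f(t), f'(t), t) - \frac{d}{dt}\Bigl(\frac{\partial F}{\partial y^i}(f(t), f'(t), t)\Bigr) = 0,
\]
where now $\partial/\partial x^i$ denotes $D_i$ and $\partial/\partial y^i$ denotes $D_{n+i}$.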
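A surprisingly significant role will be played by a side-effect of this derivation:

THE BOUNDARY TERM COROLLARY. For any critical point $f$ for $J$ we have
\[
\frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} = \sum_{i=1}^n \frac{\partial\alpha^i}{\partial u}(0,t)\,\frac{\partial F}{\partial y^i}(f(t), f'(t), t)\bigg|_a^b.
\]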
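The corollary concerns variations that need not keep the endpoints fixed, and it can be checked on a small example. The sketch below again uses the sample integrand $F(x, y, t) = \tfrac12 y^2 - \tfrac12 x^2$ (an assumption for illustration, with $\partial F/\partial y = y$), the extremal $f(t) = \sin t$ on $[0, 1]$, and the variation $\alpha(u,t) = f(t) + u(1 + t)$, which moves both endpoints. All names are hypothetical.

```python
import math

def action(f, df, a, b, n=2000):
    """Midpoint-rule value of J(f) for F(x, y, t) = y**2/2 - x**2/2."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += 0.5 * df(t) ** 2 - 0.5 * f(t) ** 2
    return total * h

a, b, eps = 0.0, 1.0, 1e-4
f, df = math.sin, math.cos           # extremal: f'' = -f
W = lambda t: 1 + t                  # does NOT vanish at the endpoints
dW = lambda t: 1.0

def J(u):
    return action(lambda t: f(t) + u * W(t), lambda t: df(t) + u * dW(t), a, b)

derivative = (J(eps) - J(-eps)) / (2 * eps)

# Boundary term: W(t) * dF/dy evaluated between a and b, with dF/dy = f'(t).
boundary_term = W(b) * df(b) - W(a) * df(a)

print(derivative, boundary_term)
```

Both numbers agree with $2\cos 1 - 1$ up to quadrature error, since the integral term in (*) vanishes along an extremal.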




Hamilton's principle. If $(q^1, \dots, q^n)$ is a coordinate system on $M$, and we apply these results to the coordinate functions for $L\colon TM\times\mathbb{R} \to \mathbb{R}$, the Euler equations translate exactly into Lagrange's equations
\[
\frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}(c'(t), t)\Bigr) - \frac{\partial L}{\partial q^i}(c'(t), t) = 0,
\]
which are also called the Euler-Lagrange equations, so that the solutions $c$ to Lagrange's equations for $L\colon TM \to \mathbb{R}$ are precisely the critical functions for
\[
J(c) = \int_{t_1}^{t_2} L(c(t), c'(t), t)\,dt.
\]
This shows immediately that for any $L\colon TM\times\mathbb{R} \to \mathbb{R}$, if Lagrange's equations for $c$ hold in one coordinate system, then they will also hold in an overlapping one. Of course, this is still a somewhat indirect proof of the "invariance" of these equations. We have not provided a direct interpretation of Lagrange's equations, nor will we be in a position to do so until Part IV. This is not so surprising when one notes that if $L(v) = T(v) = \tfrac12\langle v, v\rangle$ for a Riemannian metric $\langle\ ,\ \rangle$ on $M$, then $J(c)$ is precisely the "energy" $E(c)$ (DG, pg. 324), whose critical points are, by definition, the geodesics for $\langle\ ,\ \rangle$, and an invariant description of their equations requires the additional machinery of connections, as described, ad nauseam, in DG, Vol. 2. Problem 1 investigates these same equations for the more general case where $L = \tfrac12\langle\ ,\ \rangle - V$.


For Lagrangians that arise from mechanics problems, the identification of the Euler equations with Lagrange's equations for $L$ is called

Hamilton's principle. The solutions $c\colon [t_1, t_2] \to M$ for a system with Lagrangian $L = T - V$ are the critical points for
\[
J(c) = \int_{t_1}^{t_2} L(c(t), c'(t), t)\,dt.
\]


Such so-called variational principles, which may be given an interpretation 
that is teleological, saying that “Nature” always chooses some sort of optimal 
path, have often led to philosophical clashes, but modern physicists are inter- 
ested in them mainly because they often provide a useful path for extending 
investigations to other areas beyond classical mechanics, like continuum and 
fluid mechanics, and quantum mechanics (and physicists tell us that quantum 
mechanics explains how, in a sense, particles do try out all the different paths 
in order to choose the optimal one). 

The name “Hamilton’s principle” is actually something of an anachronism, 
since Hamilton’s work, the subject of Part IV, appeared nearly half a century 




after Mécanique Analytique was published. Lagrange stated the principle, but 
didn’t provide a name for it, describing it only as a principle “which I view not 
as a metaphysical principle but as a simple and general result of the laws of 
mechanics.” 


Maupertuis and the Principle of Least Action. Another variational principle, the Principle of Least Action, had been formulated by Maupertuis, quite definitely as a metaphysical principle, even supposedly proving the existence of God, in connection with refraction of light (cf. Chapter 15), and then applied with appropriate modifications to mechanics.

For a Lagrangian $L\colon TM \to \mathbb{R}$ we consider a path $c\colon [t_1, t_2] \to M$ and the integral of its action (top of page 449),
\[
A(c) = \int_{t_1}^{t_2} \sum_{i=1}^n \frac{\partial L}{\partial\dot q^i}(c'(t))\,\dot q^i(c'(t))\,dt.
\]
We want to show that $c$ is a critical point for $A$ if and only if it satisfies Lagrange's equations, except that the paths and the variations allowed will be quite different from those used in Hamilton's principle.

To begin with, we will now consider only Lagrangians $L$ that do not depend explicitly on time.

In addition, we restrict the class of paths by requiring that the energy $E = A - L$ of $c$ has a constant value $E_0$, and consider only curves whose energy likewise has the constant value $E_0$.

On the other hand, though we restrict our attention to curves that begin and end at the same points as $c$, we do not require that they are defined on the same time interval $[t_1, t_2]$. So instead of a variation $\alpha\colon (-\varepsilon,\varepsilon)\times[t_1,t_2] \to M$, we will have $\alpha$ defined on a region $D \subset (-\varepsilon,\varepsilon)\times\mathbb{R}$ bounded by the graphs of functions $a_1$ and $a_2 > a_1$, where $\bar\alpha(u)$ is defined on the interval $[a_1(u), a_2(u)]$, and $a_1(0) = t_1$ and $a_2(0) = t_2$. [We could just as well assume that $a_1(u) = t_1$ for all $u$, with only $a_2(u)$ varying.]

We still want to have each $\bar\alpha(u)$ go from $c(t_1)$ to $c(t_2)$, so that for all $u$ the components $\alpha^i = q^i\circ\alpha$ satisfy
\[
\alpha^i(u, a_\nu(u)) = q^i(c(t_\nu)), \qquad \nu = 1, 2,
\]
and differentiating with respect to $u$ gives
\[
\frac{\partial\alpha^i}{\partial u}(0, t_\nu) + \frac{\partial\alpha^i}{\partial t}(0, t_\nu)\cdot a_\nu{}'(0) = 0
\]
or
\[
\text{(A)}\qquad \frac{\partial\alpha^i}{\partial u}(0, t_\nu) = -\dot q^i(c'(t_\nu))\cdot a_\nu{}'(0).
\]


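Since all the curves $\bar\alpha(u)$ are assumed to have constant energy $E_0$ and $A = E + L$, we have
\[
A(\bar\alpha(u)) = \int_{a_1(u)}^{a_2(u)} \bigl[L(\bar\alpha(u)'(t)) + E_0\bigr]\,dt.
\]
When we compute the derivative of $A(\bar\alpha(u))$ at $u = 0$, the result corresponding to (*) on page 462 has an extra term coming from differentiating the limits on the integral, namely
\[
a_2{}'(0)\cdot\bigl[L(c'(t_2)) + E_0\bigr] - a_1{}'(0)\cdot\bigl[L(c'(t_1)) + E_0\bigr].
\]
Since each $L + E$ term can be replaced by the action $A = \sum_i (\partial L/\partial\dot q^i)\,\dot q^i$, the $\dot q^i$ part of this additional term is
\[
\text{(B)}\qquad a_2{}'(0)\cdot\frac{\partial L}{\partial\dot q^i}(c'(t_2))\cdot\dot q^i(c'(t_2)) - a_1{}'(0)\cdot\frac{\partial L}{\partial\dot q^i}(c'(t_1))\cdot\dot q^i(c'(t_1)).
\]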


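On the other hand, the final boundary term in the result corresponding to (*) will be
\[
\text{(C)}\qquad \sum_{i=1}^n \frac{\partial\alpha^i}{\partial u}(0,t)\,\frac{\partial L}{\partial\dot q^i}(c'(t))\bigg|_{t_1}^{t_2}.
\]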


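Using (A), we see that (B) and (C) cancel each other out, so we finally obtain
\[
\int_{t_1}^{t_2} \sum_{i=1}^n \frac{\partial\alpha^i}{\partial u}(0,t)\Bigl[\frac{\partial L}{\partial q^i}(c'(t)) - \frac{d}{dt}\Bigl(\frac{\partial L}{\partial\dot q^i}(c'(t))\Bigr)\Bigr]\,dt
\]
for the quantity that should be 0.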

We wish to conclude, as before, that the terms in brackets must be 0, but a little caution is needed, as we have to choose $\alpha$'s for which the $\bar\alpha(u)$ all have constant energy $E_0$. In fact, we have to add the extra hypothesis that $c$ is not a critical path for $E$. The sufficiency of this hypothesis follows from rather general notions concerning infinite dimensional manifolds; Addendum A, already referred to in Problem 6-21, contains a lemma giving the classical argument. Q.E.D.








(A very different approach to Maupertuis’ Principle of Least Action is mentioned 
on page 592.) 




It was Euler who correctly formulated the principle in a way that applied to 
mechanics, rather than light, with the additional crucial hypothesis of constant 
energy, although Maupertuis simply referred to this as “a beautiful application 
of my principle”.! 

Despite Maupertuis' fervent advocacy for the importance of the Principle of Least Action, nowadays it definitely plays second fiddle to Hamilton's principle, but it is not that hard to understand why his principle was once so popular in mechanics. In the expression for $A$, the term $\partial L/\partial\dot q^i$ is essentially momentum (page 445), so we are basically looking at the product of mass, velocity, and distance (the $\dot q^i(t)\,dt$ factor). In Chapter 15 we will see how this strange combination was hit upon, but one can at least imagine wanting to minimize this quantity, whereas there doesn't really seem to be any reason at all why one would want to minimize the Lagrangian in Hamilton's principle! In fact, nowadays the alluring name of "the principle of least action" is often used for Hamilton's principle, with "Maupertuis' principle of least action" used to specifically refer to Maupertuis' version.


Jacobi's form of the principle of least action. The inherent inelegance in the proper statement of the Principle of Least Action may account for the murky presentations that were often given. Jacobi, in his Vorlesungen über Dynamik, Jacobi [1] (now available in English translation), bluntly states that "In almost all textbooks, even in the best, those of Poisson, Lagrange and Laplace, the principle has been so presented that, in my view, it is impossible to understand." Jacobi then gave a geometric interpretation completely eliminating the time $t$.

When $L = T - V = \tfrac12\langle\ ,\ \rangle - V$ for a Riemannian metric $\langle\ ,\ \rangle$ on $M$, our result states that a curve $c\colon [t_1, t_2] \to M$ is a solution of Lagrange's equations if and only if $c$ is a critical point, under the allowed variations, for $\int_{t_1}^{t_2} T(c'(t))\,dt$.

Since $T > 0$, it turns out (cf. DG, pp. 331-333 for details) that $c$ is also a critical point for
\[
\int_{t_1}^{t_2} \sqrt{T(c'(t))}\,dt = \text{arclength of } c \text{ from } c(t_1) \text{ to } c(t_2),
\]
and conversely, given any such critical point $c$, reparameterization by arclength will then be a critical point for the original integral, giving a formulation of the principle that emphasizes the dependence only on the shape of the curve.


¹ Maupertuis' biography, entailing a somewhat scandalous scientific battle with socio-political overtones, appears to be nearly as convoluted as his principle. See Terrall [1] (and for contrasting views on König and the disputed letters of Leibniz see the MacTutor History of Mathematics articles "Johann Samuel König" and "The Berlin Academy and forgery"), as well as remarks in Dugas [1]. Thomas Carlyle's humorously condescending assessment of Maupertuis, written in 19th-century high literary style, is quoted in Chandrasekhar [2; pp. 382-384].




We can also write our integral as
\[
\int_{t_1}^{t_2} \sqrt{T(c'(t))}\,\sqrt{T(c'(t))}\,dt = \int_{t_1}^{t_2} \sqrt{E_0 - V(c(t))}\,\sqrt{T(c'(t))}\,dt
\]
[assuming that $E_0$, the constant energy of the curve $c$, satisfies $E_0 - V > 0$ on the image of $c$], showing that the solutions to Lagrange's equations can be identified as reparameterizations of the curves of constant energy $E = 1$ that are geodesics in the "Jacobi metric" $(E_0 - V)\langle\ ,\ \rangle$.


Noether's theorem. One application of Hamilton's principle provides a very simple proof of an important classical result.

Let $L\colon TM \to \mathbb{R}$ be any smooth map on the tangent space of $M$, in terms of which we can write Lagrange's equations, and consider a smooth one-parameter family of smooth maps $\phi_s\colon M \to M$; more precisely, we have a smooth map $\phi\colon (-\varepsilon,\varepsilon)\times M \to M$, and each $\phi_s$ denotes $p \mapsto \phi(s, p)$ for $p \in M$.

The maps $\phi_s$ can also be used to produce the maps $(\phi_s)_*$ from $TM$ to $TM$. In order to avoid a superfluity of $*$'s, it will be more convenient simply to call these maps $\Phi_s\colon TM \to TM$. Then we can make the following definition:

We say that the $\phi_s$ preserve $L$ if for every tangent vector $v \in TM$ we have $L(\Phi_s(v)) = L(v)$.

[To find $\Phi_s(v)$ for any fixed $s$, we take a curve $\gamma$ with $\gamma'(0) = v$, and consider the tangent vector at $t = 0$ of $t \mapsto \phi_s(\gamma(t))$.]
As a trivial example, which will at least show what we are talking about, if $M = \mathbb{R}^3$, with the standard coordinates $(x^1, x^2, x^3)$, and
\[
L = \tfrac12 m\bigl((\dot x^1)^2 + (\dot x^2)^2 + (\dot x^3)^2\bigr) - V(x^2, x^3),
\]
then $\phi_s(x^1, x^2, x^3) = (x^1 + s, x^2, x^3)$ preserve $L$. If the $V$ term is absent, then $\phi_s(x^1, x^2, x^3) = (x^1 + as, x^2 + bs, x^3 + cs)$ preserve $L$ if $(a, b, c)$ has length 1.

Given a curve $c$ in $M$ we can consider the curves $c_s$ defined by
\[
c_s(t) = \phi_s(c(t)) = \phi(s, c(t)),
\]
and if the $\phi_s$ preserve $L$ and $c\colon \mathbb{R} \to M$ is a solution to Lagrange's equations for $L$, then each curve $c_s$ will be also. We also have the "variation vector field" $W = \partial\phi/\partial s$ along $c$ determined by $\phi$, with $W(c(t))$ being the tangent vector at 0 of the curve $s \mapsto \phi_s(c(t))$.

At each $t$ consider
\[
\Phi_c(t) = \lim_{h\to 0}\frac{L\bigl(c'(t) + h\cdot W(c(t))\bigr) - L(c'(t))}{h}.
\]
(This can be thought of as the directional derivative of $L$ restricted to $M_{c(t)}$ in the direction given by $W(c(t))$.) Noether's Theorem says that this quantity is constant along $c$ (it is an "integral" for the solutions of Lagrange's equations).

NOETHER'S THEOREM. If the $\phi_s$ preserve $L$, then $\Phi_c$ is constant along any solution $c$ of Lagrange's equations for $L$.
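PROOF. Since the $\phi_s$ preserve $L$, each of the curves
\[
c_s(t) = \phi_s(c(t)) = \phi(s, c(t))
\]
is also a solution of Lagrange's equations for $L$, and thus an extremal for
\[
\int_a^b L(c(t), c'(t), t)\,dt, \qquad\text{for all } a \text{ and } b \text{ in the interval under consideration.}
\]
The Boundary Term Corollary on page 462 then says that for all such $a$ and $b$, we have
\[
0 = \sum_{i=1}^n \frac{\partial\phi^i}{\partial s}\,\frac{\partial L}{\partial\dot q^i}\bigg|_a^b, \qquad\text{so that}\qquad \sum_{i=1}^n \frac{\partial\phi^i}{\partial s}\,\frac{\partial L}{\partial\dot q^i} \text{ is constant.} \qquad\blacksquare
\]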
The main problem with this proof is that it makes one question why the statement should be considered an important result! (For a more direct proof, see Arnold [2; §20].) Noether's original paper, Noether [1], was actually concerned with questions of continuum mechanics rather than the mechanics of particles, and the version for ordinary mechanics is really a "toy example", serving only as a template for extending the ideas to other areas, not only to continuum mechanics and fluid mechanics, but also to areas like quantum mechanics.

One simple illustration of our elementary version of Noether's theorem concerns the cyclic coordinates discussed on page 445. If $q^1$ is cyclic, then the $\phi_s$ given in coordinates by $(q^1, \dots, q^n) \mapsto (q^1 + s, q^2, \dots, q^n)$ will preserve $L$. In this case we have $\partial\phi^1/\partial s = 1$, and $\partial\phi^i/\partial s = 0$ for all other $i$, and Noether's theorem says that
\[
\sum_{i=1}^n \frac{\partial L}{\partial\dot q^i}\cdot\frac{\partial\phi^i}{\partial s} = \frac{\partial L}{\partial\dot q^1} \text{ is a constant,}
\]
so the invariance equation obtained for cyclic coordinates is a special case of Noether's theorem.
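For a symmetry that is not simply a shift of one coordinate, the conserved sum $\sum_i (\partial L/\partial\dot q^i)(\partial\phi^i/\partial s)$ can be watched along a numerically integrated solution. The sketch below is an illustration under stated assumptions, not an example from the text: it takes $L = \tfrac12(\dot x^2 + \dot y^2) - \tfrac14(x^2 + y^2)^2$ on $\mathbb{R}^2$ with unit mass, which is preserved by the rotations $\phi_s$ through angle $s$, so that $\partial\phi/\partial s = (-y, x)$ at $s = 0$. The potential, the initial data, and all function names are hypothetical choices.

```python
def acceleration(x, y):
    """Force per unit mass for V = (x^2 + y^2)^2 / 4, i.e. -grad V."""
    r2 = x * x + y * y
    return -r2 * x, -r2 * y

def deriv(state):
    x, y, vx, vy = state
    ax, ay = acceleration(x, y)
    return (vx, vy, ax, ay)

def rk4_step(state, dt):
    """One classical Runge-Kutta step for Lagrange's equations of this L."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, dt / 2))
    k3 = deriv(shifted(k2, dt / 2))
    k4 = deriv(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def noether_quantity(state):
    """Sum of (dL/dqdot^i) * (dphi^i/ds) with dphi/ds = (-y, x), unit mass."""
    x, y, vx, vy = state
    return vx * (-y) + vy * x

state = (1.0, 0.0, 0.0, 0.5)
initial = noether_quantity(state)
max_drift = 0.0
max_abs_vx = 0.0
for _ in range(5000):
    state = rk4_step(state, 1e-3)
    max_drift = max(max_drift, abs(noether_quantity(state) - initial))
    max_abs_vx = max(max_abs_vx, abs(state[2]))

print(initial, max_drift, max_abs_vx)
```

The Noether quantity (here the angular momentum) stays at its initial value to within integration error, while $\dot x$, which is not associated with any symmetry of this $L$, changes appreciably.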




The lures of symmetry, advanced version. The idea behind Noether's theorem, of looking at the change of a Lagrangian $L = T - V$ under a one-parameter family of maps, is sometimes applied in the following manner.

Consider a closed system, where there are no outside acting forces, and hence $V$ comes only from the interacting forces of the particles. For a closed system, the choice of the origin of our inertial system should make no difference. To write equations for this fact it will be convenient, instead of using the index $\alpha$, to use an index $a = 1, \dots, K$, with each $x^a$ a triple, and with an expression like $\partial L/\partial x^a$ also standing for a triple. For any unit vector $u \in \mathbb{R}^3$, we can consider the map $\phi_s\colon \mathbb{R}^{3K} \to \mathbb{R}^{3K}$ that moves each particle over by $s\cdot u$, conveniently written as $x^a \mapsto x^a + s\cdot u$. The maps $\phi_s$, which don't change any velocities, preserve the Lagrangian. So, with $\langle\ ,\ \rangle$ now denoting the inner product of $\mathbb{R}^3$, we have
\[
0 = \frac{d}{ds}\bigg|_{s=0} L\bigl(\phi_{s*}(c'(t))\bigr) = \sum_{a=1}^K \Bigl\langle \frac{\partial L}{\partial x^a}(c'(t)),\ u\Bigr\rangle.
\]
By Lagrange's equations, we then have
\[
\frac{d}{dt}\Bigl\langle \sum_{a=1}^K \frac{\partial L}{\partial\dot x^a}(c'(t)),\ u\Bigr\rangle = 0,
\]
and since $u$ is an arbitrary unit vector we can conclude that
\[
\sum_{a=1}^K \frac{\partial L}{\partial\dot x^a}(c'(t)) \text{ is constant.}
\]
Since
\[
\frac{\partial L}{\partial\dot x^a} = \frac{\partial T}{\partial\dot x^a},
\]
this just says that
\[
\sum_{a=1}^K m_a (x^a\circ c)'(t) \text{ is constant,}
\]
i.e., that momentum is conserved. (Instead of these calculations, we could have appealed to Noether's Theorem, with equivalent calculations then needed to express the invariant.)

Though it is hardly surprising that the conserved quantity just obtained is one that we already knew about, this derivation is sometimes regarded as a proof of conservation of momentum that relies only on the "homogeneity" of space. But the allure of this so-called proof is significantly diminished when we realize that the third law is already built into Lagrange's equations, though, to be sure, it slips into the equations in a subtle way that is easy to miss.

The equation (*) on page 441, originally motivated by our consideration of constraints, can of course be applied in the case where there are no constraints, to get Lagrange's equation for a closed system, in particular for a collection of two particles with no outside forces. However, this equation is not a pair of equations on $\mathbb{R}^3$ for the two particles, separately involving the forces $F_{12}$ and $F_{21}$ of each particle on the other, but a single equation on $\mathbb{R}^6$, and this single equation holds precisely because we have $F_{12} = -F_{21}$. For the case of constraints, and in particular, for the case of rigid bodies, equation (*) may appear to have been derived quite naturally, but again, this was only because somewhere along the way we used the third law.

Needless to say, the more elegant, invariant, and sophisticated the presentation of Lagrangian mechanics and Noether's theorem, the easier it is to hide this fact, and thus appear to demonstrate that the basic laws of mechanics can be derived magically from symmetry.

Invariance of the Lagrangian can be applied in the same way to the rotational symmetry of space, and we should not be too surprised to find (Problem 1) that we simply obtain the law of conservation of angular momentum, which also must be implicit in Lagrange's equations, though now there is an added nasty twist, since conservation of angular momentum actually requires the strong form of the third law.




ADDENDUM 13A 


LAGRANGE MULTIPLIERS FOR 
CONDITIONAL CRITICAL POINTS 


This material is taken from DG, Vol. 4, pp. 295-300. 

Suppose we are given two differentiable functions $j, k\colon \mathbb{R}^n \to \mathbb{R}$, and we seek a critical point of $j$ on the set $k^{-1}(C)$. The method of Lagrange multipliers states that if $p$ is a critical point of $j$ on $k^{-1}(C)$, and $p$ is not a critical point of $k$, then there is a number $\lambda$ such that
\[
(1)\qquad \frac{\partial j}{\partial x^i}(p) = \lambda\,\frac{\partial k}{\partial x^i}(p), \qquad i = 1, \dots, n.
\]
Problem 5-2 states a more general result, but for the moment we will simply repeat the argument, which is so crucial to the present discussion, for this special case. We note that the hypotheses on $k$ imply that in a neighborhood of $p$, the set $k^{-1}(C) \subset \mathbb{R}^n$ is a hypersurface $M$, and that $k_*(X_p) = 0$ for $X_p \in \mathbb{R}^n{}_p$ precisely when $X_p \in M_p$. Every such $X_p$ is $c'(0)$ for some curve $c$ in $M$. It follows that $j(c(t))$ has a critical point at $t = 0$, which means that $j_*(X_p) = 0$. Thus the two linear functions $j_*, k_*\colon \mathbb{R}^n{}_p \to \mathbb{R}$ have the property that $\ker k_* \subset \ker j_*$. This implies that $j_* = \lambda k_*$ for some $\lambda$, which is equivalent to equation (1).
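A two-variable instance makes the argument concrete. In the sketch below, the functions $j(x, y) = x + y$ and $k(x, y) = x^2 + y^2$, the level $C = 2$, and the point $p = (1, 1)$ are assumptions chosen for illustration. The code checks equation (1) with finite-difference gradients and checks that $j$ restricted to the circle $k^{-1}(2)$ has a critical point at $p$.

```python
import math

def j(x, y):
    return x + y

def k(x, y):
    return x * x + y * y

def gradient(func, x, y, h=1e-6):
    """Central-difference gradient of func at (x, y)."""
    return ((func(x + h, y) - func(x - h, y)) / (2 * h),
            (func(x, y + h) - func(x, y - h)) / (2 * h))

p = (1.0, 1.0)                      # lies on k = 2
grad_j = gradient(j, *p)
grad_k = gradient(k, *p)
lam = grad_j[0] / grad_k[0]         # candidate multiplier from the first component

def j_on_circle(t):
    """j along the curve c(t) = sqrt(2) (cos t, sin t) in k^{-1}(2)."""
    return j(math.sqrt(2) * math.cos(t), math.sqrt(2) * math.sin(t))

h = 1e-5
t0 = math.pi / 4                    # c(t0) = p
slope = (j_on_circle(t0 + h) - j_on_circle(t0 - h)) / (2 * h)

print(lam, grad_j[1] - lam * grad_k[1], slope)
```

The multiplier comes out as $\lambda = 1/2$, the second component of equation (1) holds with the same $\lambda$, and the derivative of $j$ along the circle vanishes at $p$.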


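Now given two functions
\[
F\colon \mathbb{R}\times\mathbb{R}\times\mathbb{R} \to \mathbb{R} \qquad\text{and}\qquad G\colon \mathbb{R}\times\mathbb{R}\times\mathbb{R} \to \mathbb{R},
\]
we define, for a function $f\colon [a,b] \to \mathbb{R}$,
\[
\begin{aligned}
J(f) &= \int_a^b F(t, f(t), f'(t))\,dt\\
K(f) &= \int_a^b G(t, f(t), f'(t))\,dt.
\end{aligned}
\]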



Among all functions $f\colon [a,b] \to \mathbb{R}$ with fixed values at $a$ and $b$, and a fixed value $K(f) = C$, we seek one that is a critical point for $J$.

We want to apply the idea of the above analysis to our two functions $J$ and $K$; for simplicity, we assume that all functions are suitably differentiable, without worrying about the exact degree of differentiability required. Suppose that $f$ is a critical point for $J$ on $K^{-1}(C)$ but $f$ is not a critical point of $K$. Consider any variation $\alpha\colon (-\varepsilon,\varepsilon)\times[a,b] \to \mathbb{R}$ of $f$ that keeps endpoints fixed. We know from the formula at the top of page 462 that $dJ(\bar\alpha(u))/du\,|_{u=0}$ depends only on the function $\partial\alpha/\partial u(0,t)$ on $[a,b]$. For $W$ on $[a,b]$ with $W(a) = W(b) = 0$, we define
\[
J_*(W) = \frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} \qquad\text{for any variation } \alpha \text{ of } f \text{ with } \frac{\partial\alpha}{\partial u}(0,t) = W(t)
\]
(we should really write something like $J_{f*}(W)$, but leave out the $f$ for convenience). Notice that there always is a variation $\alpha$ with this property, for example,
\[
\alpha(u,t) = f(t) + uW(t).
\]


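Defining $K_*(W)$ in the same way, we have functions $J_*, K_*\colon V \to \mathbb{R}$, where $V$ is the vector space of all functions $W$ on $[a,b]$ with $W(a) = W(b) = 0$. We claim that $J_*$ (and likewise $K_*$) is linear. To see this we choose two variations $\alpha_1$ and $\alpha_2$ with
\[
\frac{\partial\alpha_i}{\partial u}(0,t) = W_i(t),
\]
and define the variation $\alpha$ by
\[
\alpha(u,t) = \alpha_1(u,t) + \alpha_2(u,t) - f(t).
\]
Then
\[
\frac{\partial\alpha}{\partial u}(0,t) = W_1(t) + W_2(t),
\]
so
\[
\begin{aligned}
J_*(W_1 + W_2) &= \frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0}\\
&= \frac{dJ(\bar\alpha_1(u))}{du}\bigg|_{u=0} + \frac{dJ(\bar\alpha_2(u))}{du}\bigg|_{u=0} \qquad\text{as one sees by inspecting the formula on page 462}\\
&= J_*(W_1) + J_*(W_2).
\end{aligned}
\]
Homogeneity is proved similarly.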




LEMMA. If $K(f) = C$, where the function $f$ is not a critical point of $K$, and $K_*(W) = K_{f*}(W) = 0$, then $W = \partial\alpha/\partial u(0,t)$ for some variation $\alpha$ with the property that each $\bar\alpha(u)$ is in $K^{-1}(C)$.


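PROOF. Since $f$ is not a critical point, there is $W_1$ with $K_*(W_1) \ne 0$. Let $L\colon \mathbb{R}^2 \to \mathbb{R}$ be
\[
L(r, s) = K(f + rW + sW_1).
\]
If we define
\[
\beta(u,t) = f(t) + uW_1(t),
\]
then $\beta$ is a variation of $f$ with $\partial\beta/\partial u(0,t) = W_1(t)$ and $\bar\beta(u) = f + uW_1$. So
\[
K_*(W_1) = \lim_{u\to 0}\frac{K(f + uW_1) - K(f)}{u} = \frac{\partial L}{\partial s}(0,0).
\]
Similarly,
\[
K_*(W) = \frac{\partial L}{\partial r}(0,0).
\]
Since
\[
\begin{aligned}
L(0,0) &= K(f) = C\\
\frac{\partial L}{\partial s}(0,0) &= K_*(W_1) \ne 0,
\end{aligned}
\]
the implicit function theorem shows that there is a function $r \mapsto s(r)$, from a neighborhood of 0 in $\mathbb{R}$ to a neighborhood of 0 in $\mathbb{R}$, such that
\[
(1)\qquad C = L(r, s(r)) = K(f + rW + s(r)W_1) \qquad\text{for small } r.
\]
Notice that the first part of the equation gives, upon differentiating with respect to $r$,
\[
0 = \frac{\partial L}{\partial r}(0,0) + \frac{\partial L}{\partial s}(0,0)\,s'(0) = K_*(W) + K_*(W_1)\,s'(0) = K_*(W_1)\,s'(0),
\]
and hence
\[
s'(0) = 0.
\]
Thus, if we define the variation $\alpha$ by
\[
\alpha(u,t) = f(t) + uW(t) + s(u)W_1(t),
\]
then each $\bar\alpha(u) = f + uW + s(u)W_1$ is in $K^{-1}(C)$ by (1), and also
\[
\frac{\partial\alpha}{\partial u}(0,t) = W(t) + s'(0)W_1(t) = W(t). \qquad\blacksquare
\]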




THEOREM (Euler's Rule). If $f$ is a critical point of $J$ on $K^{-1}(C)$ and $f$ is not a critical point of $K$, then there is a number $\lambda$ such that $f$ is a critical point of $J - \lambda K$ (and consequently the Euler equations for $J - \lambda K$ hold for $f$).


PROOF. Consider the two linear functions $J_*, K_*\colon V \to \mathbb{R}$. If $K_*(W) = 0$, let $\alpha$ be the variation given by the Lemma, with all $\bar\alpha(u)$ in $K^{-1}(C)$. Since $f$ is a critical point of $J$ on $K^{-1}(C)$, the function $u \mapsto J(\bar\alpha(u))$ has a critical point at 0, and consequently
\[
J_*(W) = \frac{dJ(\bar\alpha(u))}{du}\bigg|_{u=0} = 0.
\]
Thus $\ker K_* \subset \ker J_*$. The vector space $V$ is infinite dimensional, but it still follows (Problem 5-2) that there is a number $\lambda$ with $J_* = \lambda K_*$, which is equivalent to the assertion that $f$ is a critical point of $J - \lambda K$. $\blacksquare$
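Euler's Rule can be illustrated on a standard isoperimetric-type example, chosen here as an assumption and not taken from the text: on $[0, 1]$ with $f(0) = f(1) = 0$, let $J(f) = \int_0^1 f'(t)^2\,dt$ and $K(f) = \int_0^1 f(t)\,dt$ with $K(f) = C$. The Euler equation for $J - \lambda K$ is $-2f'' - \lambda = 0$, which gives $f(t) = 6C\,t(1 - t)$ and $\lambda = 24C$. For these functionals $J_*(W) = \int_0^1 2f'W'\,dt$ and $K_*(W) = \int_0^1 W\,dt$, and the sketch below checks $J_* = \lambda K_*$ on two test functions $W$. All names are hypothetical.

```python
import math

C = 1.0
LAMBDA = 24 * C                      # multiplier predicted by Euler's Rule

def df(t):
    """Derivative of the candidate f(t) = 6 C t (1 - t)."""
    return 6 * C * (1 - 2 * t)

def integrate(g, a=0.0, b=1.0, n=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def J_star(dW):
    """Derivative at u = 0 of the integral of (f' + u W')^2."""
    return integrate(lambda t: 2 * df(t) * dW(t))

def K_star(W):
    """Derivative at u = 0 of the integral of (f + u W)."""
    return integrate(W)

# Two test functions vanishing at 0 and 1, with their derivatives.
tests = [
    (lambda t: math.sin(math.pi * t), lambda t: math.pi * math.cos(math.pi * t)),
    (lambda t: t * t * (1 - t),       lambda t: 2 * t - 3 * t * t),
]
gaps = [J_star(dW) - LAMBDA * K_star(W) for W, dW in tests]
print(gaps)
```

Both gaps vanish up to quadrature error, so $\ker K_* \subset \ker J_*$ holds on these samples with the single multiplier $\lambda = 24C$.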


The straightforward extension to the case of functions f: [a,b] > R”™ is left 
to the reader. 




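PROBLEM

1. (a) Consider rotations $A(s)$ for which $\omega$, the vector corresponding to the skew-symmetric matrix $A'(0)$, satisfies $|\omega| = 1$, and let $\phi_s\colon \mathbb{R}^{3K} \to \mathbb{R}^{3K}$ be defined by
\[
\phi_s(x^a) = A(s)\cdot x^a.
\]
Show that
\[
\begin{aligned}
0 &= \frac{d}{ds}\bigg|_{s=0} L\bigl(\phi_{s*}(c'(t))\bigr)\\
&= \sum_a \Bigl\langle \frac{\partial L}{\partial x^a}(c'(t)),\ \omega\times c_a(t)\Bigr\rangle + \sum_a \Bigl\langle \frac{\partial L}{\partial\dot x^a}(c'(t)),\ \omega\times c_a{}'(t)\Bigr\rangle\\
&= \frac{d}{dt}\Bigl\langle \omega,\ \sum_a c_a(t)\times\frac{\partial L}{\partial\dot x^a}(c'(t))\Bigr\rangle,
\end{aligned}
\]
leading to conservation of angular momentum.

(b) Note that here we are using the standard form of Lagrange's equations, which involve conservative forces, and the result is not so surprising when we note that the most common conservative forces are radially symmetric ones! Compare the situation for the more general form of Lagrange's equations given in the footnote on page 443.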





CHAPTER 14 
SMALL OSCILLATIONS 


Small oscillations about equilibrium points, briefly alluded to in Chapter 6,
and investigated in special cases in Chapter 8, can be given a simple unified
approach in terms of Lagrangian mechanics, supplying a short coda to Part III.
We will be considering a Lagrangian $L : TM \to \mathbb{R}$ of the form

$$L(v) = \tfrac12\langle v, v\rangle - V$$

for a Riemannian metric $\langle\ ,\ \rangle$ on $M$, written, say, as

$$L = \frac12\sum_{i,j=1}^n g_{ij}\,\dot q^i\dot q^j - V.$$
j=l 


We first seek equilibrium points for the Lagrange equations, that is, we ask
when a constant curve $c(t) = p \in M$ can be a solution of the equations

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial \dot q^i}(c'(t))\Big) - \frac{\partial L}{\partial q^i}(c'(t)) = 0,$$

where $c'(t)$ is now always the zero vector $0_p \in M_p$. Since $\partial T/\partial q^i$ and $\partial T/\partial \dot q^i$
will be a sum of terms involving at least one $\dot q^j$, each of which is 0 at $0_p$, the part
of Lagrange’s equation involving $T$ will automatically be 0. So, analogously to
the situation in Chapter 6 (page 213ff.), we have an equilibrium point precisely
when each

$$\frac{\partial V}{\partial q^i}(p) = 0.$$


If $p$ is an equilibrium point, the function $T \ge 0$ on $TM$ has a minimum value
at $0_p$, so if $V$ has a strict local minimum at $p$, then the energy $E = V + T$
also has a strict local minimum at $0_p$, this minimum $E_0$ being equal to $V(p)$.
For small $\varepsilon > 0$, the set of points in $TM$ with $E < E_0 + \varepsilon$ will be a small
neighborhood of $0_p$ (we need a strict local minimum at $p$ to insure that the
neighborhood is small). If we consider the component of this neighborhood that
contains $0_p$, then conservation of energy says that any solution curve starting
at a point of this component must stay in this component, which shows that $p$
is a point of stable equilibrium.




(On the other hand, even if $V$ has a strict local maximum at $p$, it does not
follow that $p$ is a point of unstable equilibrium. It is easy to see that one can’t
expect this in the $C^\infty$ case, but even in the analytic case it is apparently only
known for the easy case of dimension 1, and for dimension 2.)

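We now want to “linearize” the Lagrange equations near an equilibrium
point, our linearization of the equations for the spherical pendulum in Chapter 8
(page 291) being a special case. We note that

$$V(q^1,\dots,q^n) = V(p) + \sum_{i=1}^n \frac{\partial V}{\partial q^i}(p)\,q^i + \frac12\sum_{i,j=1}^n \frac{\partial^2 V}{\partial q^i\,\partial q^j}(p)\,q^i q^j + \cdots,$$

where we might as well choose $V(p) = 0$, and where the first sum is 0 because $p$
is an equilibrium point, while

$$g_{ij}(q^1,\dots,q^n) = g_{ij}(p) + \sum_{k=1}^n \frac{\partial g_{ij}}{\partial q^k}(p)\,q^k + \cdots,$$

and since $T$ is quadratic in the $\dot q^i$, the lowest nonvanishing approximation to $T$
is simply

$$\frac12\sum_{i,j=1}^n g_{ij}(p)\,\dot q^i\dot q^j,$$

giving us the linearization

$$(*)\qquad \sum_{j=1}^n g_{ij}(p)\,(c^j)''(t) + \sum_{j=1}^n \frac{\partial^2 V}{\partial q^i\,\partial q^j}(p)\,c^j(t) = 0, \qquad i = 1,\dots,n.$$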


[More formally, linearization for a set of first order equations was already
introduced in Addendum 8G, and a set of second order equations, like the Lagrange
equations, is linearized by converting it into a system of first order equations in
$2n$ variables (as in Addendum 10A). Choosing a coordinate system $q$ so that all
$q^i(p) = 0$ for our equilibrium point $p$, Problem 1(b) shows that (*) is indeed
the appropriate linearization.]
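
As a minimal illustration of (*), not taken from the text, consider a plane pendulum of mass $m$ and length $l$ (symbols assumed here for the example), with the angle $\theta$ from the downward vertical as the single coordinate, so that $n = 1$:

```latex
\begin{aligned}
T &= \tfrac12 m l^2\,\dot\theta^2, \qquad V = -mgl\cos\theta,
  \qquad g_{11} = m l^2,\\
\frac{\partial V}{\partial\theta} &= mgl\sin\theta = 0
  \iff \theta \in \{0, \pi\} \quad(\text{the two equilibrium points}),\\
\theta = 0:\quad & \frac{\partial^2 V}{\partial\theta^2}(0) = mgl
  \;\Longrightarrow\; m l^2\,c''(t) + mgl\,c(t) = 0
  \;\Longrightarrow\; c'' + \frac{g}{l}\,c = 0,\\
\theta = \pi:\quad & \frac{\partial^2 V}{\partial\theta^2}(\pi) = -mgl
  \;\Longrightarrow\; c'' - \frac{g}{l}\,c = 0
  \quad(\text{with } c = \theta - \pi).
\end{aligned}
```

At the lower equilibrium the linearized motion is harmonic with frequency $\sqrt{g/l}$; at the upper one the solutions of the linearized equation are exponentials.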

Thus, the linearized system is described by the two symmetric $n \times n$ matrices
$T = (g_{ij}(p))$ and $V = (V_{ij}(p)) = (\partial^2 V/\partial q^i\,\partial q^j\,(p))$, where the matrix $T$ is
positive definite.

Symmetry of $T$ and $V$ means that the corresponding linear transformations
$T, V : \mathbb{R}^n \to \mathbb{R}^n$ are self-adjoint with respect to the usual inner product on $\mathbb{R}^n$:

$$\langle T(v), w\rangle = \langle v, T(w)\rangle, \qquad \langle V(v), w\rangle = \langle v, V(w)\rangle \qquad\text{for all } v, w \in \mathbb{R}^n.$$




It follows that there is a diagonalizing basis $w_1,\dots,w_n$ of $\mathbb{R}^n$ such that
$$(\mathrm{D})\qquad T(w_i) = w_i, \qquad V(w_i) = \lambda_i w_i$$
for certain $\lambda_1,\dots,\lambda_n$. In fact, since $T$ is positive definite, we can choose an
orthonormal basis for the inner product $(v, w) = \langle T(v), w\rangle$. When we write $V$
in this basis it is still symmetric, since that is still equivalent to the condition
$(V(v), w) = (v, V(w))$ for all $v, w \in \mathbb{R}^n$. So $V$ has a basis of eigenvectors
$v_1,\dots,v_n$ which are orthonormal for $(\ ,\ )$, with corresponding eigenvalues $\lambda_i$.

If we let $C : \mathbb{R}^n \to \mathbb{R}^n$ be the linear transformation that takes the standard
basis $e_1,\dots,e_n$ to $v_1,\dots,v_n$, then in the coordinate system $C \circ q$, equations (*)
reduce to
$$(c^i)''(t) + \lambda_i\,c^i(t) = 0.$$


When the matrix $V$ is positive definite, implying that the potential $V$ has a strict
local minimum at $p$, and thus that $p$ is a stable equilibrium point, the general
solution is the sum of harmonic oscillations, along axes that are orthogonal with
respect to $T$. If the matrix $V$ instead has some $\lambda_i < 0$, we have to allow exponential
“oscillations” for each such $\lambda_i$. For $\lambda_i = 0$ we might have an oscillation only
discoverable by looking at higher derivatives of $V$, and not approximable by a
harmonic oscillation, or it might indicate uniform linear motion (see the example
at the end of this chapter). Including all these under the rubric of oscillations,
we can then state the result:


All small oscillations of a system about an equilibrium point can be written 
as the sum of small oscillations along a set of T-orthogonal axes. 


It is important to remember, however, that the standard short-hand terminology 
“small oscillations of a system”, means nothing other than “the oscillations of 
the first order approximation to a system”. As in the case of the spherical 
pendulum, we are not actually studying small oscillations of the system itself, 
and, in fact, for the general case, the system may not have any small motions 
that are oscillations. Moreover, the sum of two or more of the small oscillations
along the T-orthogonal axes will not be periodic if the corresponding $\lambda_i$ are not
commensurable (a situation similar to that for Lissajous figures in Chapter 8).
Note that equation (D) implies that the $\lambda_i$ are the roots of the equation


det(V — AT) = 0, 


which connects this description of the solutions with the normal modes studied
in Chapter 8. That is, if we look for solutions of (*) of the form

$$c^j(t) = A_j\cos\omega t,$$

with all $q^j$ components having the same frequency $\omega$, we obtain

$$\sum_{j=1}^n \big(V_{ij}(p) - \omega^2 g_{ij}(p)\big)\,c^j(t) = 0,$$

which shows that $\det(V - \omega^2 T) = 0$, so the normal modes correspond to
$\omega_i = \sqrt{\lambda_i}$. Note, by the way, that in the normal modes, all $q^j$ components not only
have the same period, but they are also in sync (have the same phase) for each
pair with $A_j$ having the same sign, or in anti-sync (having a phase difference
of $\pi$) for pairs having opposite sign.

Thus we see that small oscillations of a system near an equilibrium point can
always be written as sums of normal modes, as promised in Chapter 8, where 
we investigated the special case of N harmonic oscillators all interacting in a 
linear way. However, the method of our general proof may not provide the 
best route for analyzing small oscillations of any particular system, because the 
natural choice of coordinates for finding the Lagrangian may not end up being 
the best choice for the analysis of small oscillations. 

An extreme example is afforded by the spherical pendulum. Recall that for
small oscillations, the coordinates $x$ and $y$ on page 291 seemed preferable to
the coordinates $\theta$ and $\phi$. As a matter of fact, in this case we can’t use $\theta$ and $\phi$,
because, as we see from the formula for $v^2$ on page 290, $T$ is not even positive
definite at the equilibrium point. (On the other hand (Problem 3-5), this only
eliminates cases where the pendulum is actually swinging in a plane.)

In our investigation of two coupled oscillators on page 302, the angles $\theta_1$
and $\theta_2$ that the pendulums make with the vertical are the obvious choice for
writing the Lagrangian, but our choice of the linear coordinates $x_1$ and $x_2$
worked out much better when looking for normal forms, as we did in a more
general case on page 305, because it is easier to make approximations in terms
of these coordinates. We didn’t actually write down $T - V$ in that case, but
simply passed directly to approximating equations. Even when the general
method is used, approximations are usually made as early as possible, as in
Problem 2, rather than obtaining exact formulas and taking derivatives. Of
course, such approximations can be tricky, while the longer straightforward
method can always be relied upon.




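In this regard, let us consider again the case of the double pendulum discussed
on page 307. Using the components indicated in the figure on that page, we
have

$$(\mathrm{A})\qquad \begin{aligned} x_1 &= l_1\sin\theta_1, & x_2 &= l_1\sin\theta_1 + l_2\sin\theta_2\\ y_1 &= l_1\cos\theta_1, & y_2 &= l_1\cos\theta_1 + l_2\cos\theta_2, \end{aligned}$$

and we obtain

$$\begin{aligned}
T &= \tfrac12 m_1(\dot x_1^2 + \dot y_1^2) + \tfrac12 m_2(\dot x_2^2 + \dot y_2^2)\\
  &= \tfrac12 (m_1 + m_2)\,l_1^2\,\dot\theta_1^2 + \tfrac12 m_2 l_2^2\,\dot\theta_2^2 + m_2 l_1 l_2\cos(\theta_1 - \theta_2)\,\dot\theta_1\dot\theta_2\\
V &= -m_1 g y_1 - m_2 g y_2\\
  &= -(m_1 + m_2)\,g l_1\cos\theta_1 - m_2 g l_2\cos\theta_2.
\end{aligned}$$

The need to calculate the tension on the string for the lower pendulum has been
eliminated because d’Alembert’s principle for constraints has been incorporated
into Lagrange’s equations. There is also no need to estimate the $\theta_i$, for we can
simply calculate the derivatives at $(0,0)$, finding that

$$T = \begin{pmatrix} (m_1 + m_2)\,l_1^2 & m_2 l_1 l_2\\ m_2 l_1 l_2 & m_2 l_2^2 \end{pmatrix}, \qquad
V = \begin{pmatrix} (m_1 + m_2)\,g l_1 & 0\\ 0 & m_2 g l_2 \end{pmatrix},$$

which leads to the equations

$$\begin{aligned}
\ddot\theta_1 + \frac{g}{l_1}\,\theta_1 &= -\frac{m_2}{m_1 + m_2}\,\frac{l_2}{l_1}\,\ddot\theta_2\\
\ddot\theta_2 + \frac{g}{l_2}\,\theta_2 &= -\frac{l_1}{l_2}\,\ddot\theta_1.
\end{aligned}$$

We then obtain the equations (*) on page 308 when we switch back to $x_1$ and $x_2$
using (A), which for small $\theta_i$ simplify to

$$\theta_1 = \frac{x_1}{l_1}, \qquad \theta_2 = \frac{x_2 - x_1}{l_2}.$$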
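
The matrices at $(0,0)$ can be fed straight into the equation $\det(V - \lambda T) = 0$. The following is a minimal sketch, assuming the standard linearized double-pendulum matrices with entries $(m_1+m_2)l_1^2$, $m_2 l_1 l_2$, $m_2 l_2^2$ and $(m_1+m_2)g l_1$, $m_2 g l_2$; the function names and the numerical value of $g$ are choices made for this illustration. For a $2 \times 2$ problem the determinant is a quadratic in $\lambda$, so no linear-algebra library is needed.

```python
import math


def double_pendulum_matrices(m1, m2, l1, l2, g=9.81):
    """Matrices T and V of the linearized double pendulum at theta_1 = theta_2 = 0."""
    T = [[(m1 + m2) * l1 ** 2, m2 * l1 * l2],
         [m2 * l1 * l2, m2 * l2 ** 2]]
    V = [[(m1 + m2) * g * l1, 0.0],
         [0.0, m2 * g * l2]]
    return T, V


def normal_mode_eigenvalues(T, V):
    """Both roots lambda of det(V - lambda*T) = 0 for symmetric 2x2 T and V."""
    (t11, t12), (_, t22) = T
    (v11, v12), (_, v22) = V
    # det(V - lam*T) = a*lam**2 + b*lam + c
    a = t11 * t22 - t12 ** 2
    b = -(v11 * t22 + v22 * t11 - 2.0 * v12 * t12)
    c = v11 * v22 - v12 ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


if __name__ == "__main__":
    T, V = double_pendulum_matrices(1.0, 1.0, 1.0, 1.0)
    for lam in normal_mode_eigenvalues(T, V):
        print("lambda =", lam, " omega =", math.sqrt(lam))
```

For equal masses and equal lengths the quadratic becomes $\mu^2 - 4\mu + 2 = 0$ with $\mu = \lambda l/g$, so the two eigenvalues are $\lambda = (2 \pm \sqrt2)\,g/l$.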





A final point is illustrated by the oft-presented example of the linear triatomic
molecule, where two atoms of mass $m$ are symmetrically located on each side
of an atom of mass $M$, all three atoms lying on a straight line, with the forces
between the atoms approximated by those produced by springs with spring
constant $k$ (the molecules H₂O and CO₂ fit this model reasonably well). If $l$
is the unstretched length for these springs, it is convenient to use $x_2$ as the
coordinate of the middle atom, with $-l + x_1$ and $l + x_3$ being the coordinates of
the outer atoms.

[Figure: the three atoms on a line, at $-l + x_1$, $x_2$, and $l + x_3$]

We then have

$$T = \frac{m}{2}(\dot x_1^2 + \dot x_3^2) + \frac{M}{2}\,\dot x_2^2, \qquad
V = \frac{k}{2}\big[(x_1 - x_2)^2 + (x_3 - x_2)^2\big],$$

so that

$$T = \begin{pmatrix} m & 0 & 0\\ 0 & M & 0\\ 0 & 0 & m \end{pmatrix}, \qquad
V = \begin{pmatrix} k & -k & 0\\ -k & 2k & -k\\ 0 & -k & k \end{pmatrix}.$$

The characteristic polynomial

$$0 = \det(V - \lambda T) = \det\begin{pmatrix} k - \lambda m & -k & 0\\ -k & 2k - \lambda M & -k\\ 0 & -k & k - \lambda m \end{pmatrix}$$

reduces to
$$\lambda\,(k - \lambda m)\,[k(M + 2m) - \lambda m M] = 0,$$

so that the $\omega_i = \sqrt{\lambda_i}$ are

$$\omega_1 = 0, \qquad \omega_2 = \sqrt{\frac{k}{m}}, \qquad \omega_3 = \sqrt{\frac{k}{m}\Big(1 + \frac{2m}{M}\Big)}.$$

In this particular case, the zero value of $\omega_1$ simply indicates that the whole
molecule can be moving horizontally with uniform velocity, and it arises
because we have used three coordinates, while there are really only two degrees of
freedom. We could remove this redundancy by adding an extra constraint, for
example the condition that the center of mass of the molecule should remain
stationary, but the coordinates we chose are useful for discussing the nature of
the oscillations for the two normal modes $\omega_2$ and $\omega_3$, which can be found by
solving for the eigenvectors, or in this case simply by enlightened guessing:

(1) For $\omega_2$, the middle atom remains fixed, while the outer atoms oscillate
with frequency $\omega_2$ and equal amplitude, but in anti-synch.

(2) For $\omega_3$, the two outer atoms oscillate in synch with frequency $\omega_3$, with
equal amplitude, while the middle atom oscillates in anti-synch with
them, with $2m/M$ times the amplitude.
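
The “enlightened guessing” can be checked mechanically. The sketch below assumes the matrices $T = \operatorname{diag}(m, M, m)$ and $V$ with rows $(k, -k, 0)$, $(-k, 2k, -k)$, $(0, -k, k)$, and tests that each claimed pair $(\lambda, v)$ satisfies $(V - \lambda T)v = 0$. The function names and the sample values of $m$, $M$, $k$ are chosen only for this illustration.

```python
import math


def triatomic_matrices(m, M, k):
    """T and V for outer atoms of mass m, middle atom of mass M, spring constant k."""
    T = [[m, 0.0, 0.0], [0.0, M, 0.0], [0.0, 0.0, m]]
    V = [[k, -k, 0.0], [-k, 2.0 * k, -k], [0.0, -k, k]]
    return T, V


def mode_residual(T, V, lam, vec):
    """Largest absolute entry of (V - lam*T) vec; zero for an exact normal mode."""
    n = len(vec)
    return max(
        abs(sum((V[i][j] - lam * T[i][j]) * vec[j] for j in range(n)))
        for i in range(n)
    )


def claimed_modes(m, M, k):
    """(lambda, eigenvector) pairs for uniform motion and the two oscillating modes."""
    return [
        (0.0, [1.0, 1.0, 1.0]),                      # whole molecule translating
        (k / m, [1.0, 0.0, -1.0]),                   # middle atom fixed
        ((k / m) * (1.0 + 2.0 * m / M), [1.0, -2.0 * m / M, 1.0]),
    ]


if __name__ == "__main__":
    m, M, k = 16.0, 12.0, 3.0   # arbitrary sample values
    T, V = triatomic_matrices(m, M, k)
    for lam, vec in claimed_modes(m, M, k):
        print("omega =", math.sqrt(lam), " residual =", mode_residual(T, V, lam, vec))
```

The eigenvector $(1, 1, 1)$ for $\lambda = 0$ is the uniform translation, and the other two are T-orthogonal to it, as the general theory requires.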




PROBLEMS 


1. (a) Lagrange’s equations for a curve $c$ are a set of second order equations
for the $c^i = q^i \circ c$, but they are not written in the standard form

$$\frac{d^2c^i}{dt^2} = A_i(t, c(t), dc/dt),$$

and in fact it may not be possible to write them this way, because the matrix of
coefficients of the $(c^i)''$,

$$\frac{\partial^2 T}{\partial\dot q^i\,\partial\dot q^j},$$

might be singular (a case normally discarded by considering only “regular”
Lagrangians, see Chapter 16). For the case where

$$L(v) = \tfrac12\langle v, v\rangle - V$$

for a Riemannian metric $\langle\ ,\ \rangle$ on $M$, so that

$$L(c'(t)) = \frac12\sum_{i,j=1}^n g_{ij}(c(t))\,(c^i)'(t)\,(c^j)'(t) - V(c(t)),$$

the matrix, being positive definite, is certainly not singular, and we can put
them in standard form by mimicking the procedure for putting the equations
for geodesics into standard form (DG, pp. 326-328): If the Riemannian metric
is $\langle\ ,\ \rangle = \sum_{i,j=1}^n g_{ij}\,dq^i \otimes dq^j$, with corresponding $\Gamma^k_{ij}$, show that Lagrange’s
equations can now be written as

$$\frac{d^2c^k}{dt^2} + \sum_{i,j=1}^n \Gamma^k_{ij}(c(t))\,\frac{dc^i}{dt}\,\frac{dc^j}{dt} = -\sum_{i=1}^n g^{ki}(c(t))\,\frac{\partial V}{\partial q^i}(c(t)).$$

In terms of the covariant derivative $\nabla$ (DG, Vol. 2, Chap. 6), the left side of the
equation can be written as $\nabla_{c'(t)}c'$, and if $\operatorname{grad} V$ is defined by the equation

$$\langle \operatorname{grad} V, X\rangle = dV(X) = X(V) \quad\text{for all tangent vectors } X,$$

our equation can be written as
$$\nabla_{c'(t)}c' = -\operatorname{grad} V(c(t))$$
or, using the $D/dt$ notation,
$$\frac{Dc'(t)}{dt} = -\operatorname{grad} V(c(t)).$$

(b) Use part (a) to show that (*) on page 477 is the appropriate linearization of
the Lagrange equations near an equilibrium point.




2. (a) For the coupled oscillators on page 479, where the $i$th pendulum bob has
coordinates $(x_i, y_i)$ with

$$x_i = l\sin\theta_i, \qquad y_i = l(1 - \cos\theta_i),$$

one can write, or at least imagine the mess one would obtain by writing, the
Lagrangian in terms of the $\theta_i$. Instead, write an approximation directly in terms
of the $\theta_i$ and $m$, $l$, and the spring constant $k$.

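Answer:
$$T = \frac{m l^2}{2}\,(\dot\theta_1^2 + \dot\theta_2^2), \qquad
V = \frac{mgl}{2}\,(\theta_1^2 + \theta_2^2) + \frac{k l^2}{2}\,(\theta_2 - \theta_1)^2.$$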


(b) One can now determine $T$ and $V$, and then proceed to find the solutions of
$\det(V - \lambda T) = 0$, but we already know the obvious normal modes for this case,
so instead simply express $T$ and $V$ in terms of $\phi_1 = \theta_1 + \theta_2$ and $\phi_2 = \theta_1 - \theta_2$
and determine the oscillation periods for the two modes.


INTERLUDE 


CHAPTER 15
LIGHT


Light!? This is supposed to be a book about mechanics. What is a chapter
about light doing here?

Well, perhaps it doesn’t seem all that strange, since light plays a role in spe- 
cial relativity, though we don’t cover that topic in this volume, and quantum 
mechanics has joined particles and light waves in a bond that no physicist may 
put asunder. But the studies of light and mechanics have actually been closely 
intertwined throughout their history, and this interconnection helps explain the 
development of Hamiltonian mechanics, and its relation to quantum mechanics. 


Optics in antiquity. The darkness shrouding early investigations of light is amply 
described in Ronchi [1]. Euclid (~ 330-260 B.C.), in his generally valuable Optics,
accepted the view that the eye emitted rays (the Greeks focused on vision, not
light, presuming that the eye must take an active part—as a later commentator 
asked, if light rays emanated from objects rather than the eye, how was it that 
one often couldn’t find something one was searching for, even though it was in
plain sight?). Euclid [1] even explained that far away objects couldn’t be seen 
because they were situated between two adjacent rays spread out from the eye! 

The law of reflection, “Angle of incidence equals angle of reflection” was 
simply taken as an axiom, and the renowned mathematician, engineer, and 
inventor of ancient times Hero of Alexandria (~ 10-70 A.D.), is credited with the
observation that reflected light, in going from A to B via a point C on a mirror,
always follows a path that makes the total distance AC + CB shortest. This may
well be the earliest “variational principle” ever enunciated, aside from the fact 
that light travels in straight lines, to be followed 16 centuries later by yet another 
variational principle involving light. 


Islamic scholars. Though familiar with Euclid’s Optics, Islamic scholars believed
that light was emitted from the objects one saw, the best evidence being
the camera obscura (basically a pin-hole camera). The important scholar Abu Ali
Mohammed Ibn Al Hasan Ibn Al Haytham (~ 965-1039), known to the western 
world as Alhazen, published a book in which he compared the reflection of light 
to a body’s motion, noting that if an arrow with a small spherical body at the tip 
is shot toward a mirror on a wall “at an angle to the perpendicular we shall see 
that the arrow is reflected back not on the same line by which it came, but in a 
direction which is symmetrical to the first with reference to the perpendicular 
to the mirror.” Moreover, Alhazen explained this by resolving the motion into 


487 


488 Chapter 15 


components parallel and normal to the wall, as Galileo would do centuries later, 
with the parallel component remaining unchanged and the normal component 
reversed. A Latin translation of Alhazen’s book in 1572 played an important 
part in overthrowing the dominance of ancient Greek thought in the west. 
Kepler and Galileo. In 1604, Kepler (1571-1630), best known to most of us for 
his astronomical investigations, published a book about vision, explaining for 
the first time how the image of an object is focused by a lens, and confidently 
writing “I say that vision takes place when the image of the whole hemisphere 
of the world in front of the eye and even a little more, is formed upon the 
concave reddish surface of the retina”. But it was a mystery how the lens of the 
eye could change its focal point, and for a long time Kepler’s book was barely 
noticed, lenses being considered a matter for craftsmen, rather than scientists 
(it is not even known who invented spectacles), though the interest in telescopes 
of Galileo (1564-1642) forced a revision to the scholastic view that had regarded 
telescopes not only as déclassé, but downright misleading, totally unacceptable 
devices for use in science. The third chapter of Ronchi [1], “The downfall
of ancient optics”, gives an account of these intriguing developments, which 
inaugurated the science of optics, and swept away centuries of misunderstanding 
and confusions about light and vision, leaving us free to examine more modern 
misunderstandings and confusions. 

Descartes. The first accurate law of refraction is usually attributed to work in
the early 1600’s by Willebrord Snel van Royen (1580-1626), a Dutch mathematician
whose Latin name “Snellius” led to the spelling Snell, used by everyone
except au courant scholars.¹ Considering a ray of light AB in air that is refracted
along BC after hitting water at B, and letting BD be the extension of
AB to the same horizontal distance as C, Snell found that the ratio BC/BD was
always the same. Writing BC/BD = (BE/BD)/(BE/BC), we can express the
result as the “sine law”

$$\frac{\sin\alpha}{\sin\beta} = \text{constant}.$$





Descartes (1596-1650) reached exactly this conclusion, and the suggestion that
he might have borrowed from Snell excited some nationalistic feelings, so that
the sine law of refraction is generally known as Snell’s Law, but in France as
Descartes’ Law. In any case, Descartes gave an argument for the law, although
his views on light were rather unformed and obscure, generally regarding light
as a pressure, rather than a movement of particles. Like Alhazen, he resolved
the motion (or whatever) of the light into components parallel and normal to
the line separating the air and water, with the parallel component remaining
unchanged, while the other component, rather than being reflected, changed
magnitude so that the resultant velocity would be the speed of light in water.
This led to the result that the ratio $\sin\alpha/\sin\beta$ for the incident and refracted
angles is a constant, namely, the ratio $v_2/v_1$ of the speed of light in water to the
speed of light in air.

¹ Recent scholarship has also called attention to the work of Ibn Sahl in 984 and Thomas
Harriot in 1602.

Alhazen had proposed a somewhat similar explanation for refraction, but it 
seems he didn’t recognize the difficulty that this explanation entailed: since the 
angle of refraction is less than the angle of incidence, the speed of light would 
have to be greater in water than in air. Descartes did recognize the difficulty, 
admitting “Perhaps you would be surprised if you carried out the experiment”,
and resorted to considerable extemporizing (Ronchi [1; pg. 117]) to support the 
idea that the speed should be greater in water than in air, which everyone, 
including Descartes himself and his supporters, found rather unnatural. 


Fermat. Fermat (1601-1665) rejected these arguments completely. He not only
regarded Descartes’ view of the velocities as absurd, but also objected to reliance 
on mechanical analogies, saying that one should start “from the principle, so 
common and so well-established, that Nature always acts in the shortest ways”, 
nowadays ensconced in Fermat’s principle: the path followed by light is always 
a critical path for the time. But there were obstacles to overcome. 

Assured that experiments confirmed Descartes’ law—“All the difficulty was 
therefore reduced to the fact that it appeared that I had to fight not only men 
but also Nature” —Fermat only reluctantly decided to see if his principle at least 
gave a law that agreed with Descartes’ law within experimental error. His re- 
luctance was partly due to the fact that finding the path of least time was not 
so simple as in the case of reflection. Fermat had “my method of maximis and
minimis, which is rather successful for expediting this kind of problem”, basically
setting a derivative to zero [Newton read, and generalized, this method], but
finding the equivalent of the derivative is not so simple when the square roots
have to be dealt with directly, without benefit of the chain rule and basic facts
about derivatives. However, at the end of the calculations he found that “the
reward for my effort has been the most extraordinary, the most unforeseen, and
the happiest that ever was”, namely that the path taking the least time followed
exactly Descartes’ law for refraction in the form $\sin\alpha/\sin\beta = v_1/v_2$, but with
the assumption that the speed of light in water is less than in air, $v_2 < v_1$.
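
Fermat’s conclusion can be checked numerically without any calculus. The sketch below is an illustration under assumed geometry, not Fermat’s own calculation: the surface is the line $y = 0$, the source is $A = (0, a)$ in air, the target is $C = (d, -b)$ in water, and the ray crosses the surface at $(x, 0)$. All names and sample values are chosen for the example. Since the travel time is a convex function of $x$, a simple ternary search finds the least-time crossing point, and the ratio of sines there can be compared with $v_1/v_2$.

```python
import math


def travel_time(x, a, b, d, v1, v2):
    """Time from A=(0,a) in air to C=(d,-b) in water, crossing the surface at (x,0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2


def fastest_crossing(a, b, d, v1, v2, iterations=200):
    """Ternary search on [0, d] for the crossing point of least travel time."""
    lo, hi = 0.0, d
    for _ in range(iterations):
        x1 = lo + (hi - lo) / 3.0
        x2 = hi - (hi - lo) / 3.0
        if travel_time(x1, a, b, d, v1, v2) < travel_time(x2, a, b, d, v1, v2):
            hi = x2
        else:
            lo = x1
    return 0.5 * (lo + hi)


def sine_ratio(x, a, b, d):
    """sin(alpha)/sin(beta) for the broken path through (x,0), angles from the normal."""
    sin_alpha = x / math.hypot(x, a)
    sin_beta = (d - x) / math.hypot(d - x, b)
    return sin_alpha / sin_beta


if __name__ == "__main__":
    a, b, d = 1.0, 1.0, 1.0
    v1, v2 = 1.0, 0.75          # light slower in water, as Fermat assumed
    x = fastest_crossing(a, b, d, v1, v2)
    print("crossing point:", x)
    print("sin(alpha)/sin(beta):", sine_ratio(x, a, b, d), " v1/v2:", v1 / v2)
```

The search only compares travel times, so agreement of the two printed ratios is a consequence of the least-time principle rather than something built into the code.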




Fermat’s calculations, in an unpublished paper Analysis ad Refractiones attached
to correspondence, are given in Sabra [1; pp. 144ff.], where one can clearly
see the precursor of the derivative. (In some later correspondence, a paper 
Synthesis ad refractiones took the classically preferred opposite tack, giving a di- 
rect, rather complicated, argument that the path satisfying the sine law takes 
less time than any other, cf. Sabra [1; pp. 150-152] or Dugas [1; pp. 255-257].)

The supporters of Descartes opposed Fermat’s explanation as disdainfully as 
Lagrange, a century later, would dismiss the significance of the principle of Least 
Action, which, as we will see later on, stemmed from Fermat’s principle in a 
perverted sort of way; and they gave the same sort of criticisms that most of us 
would give to such teleological principles today. Feynman [1; pp. 26-7 to 26-8] 
has a discussion of Fermat’s principle from a viewpoint informed by modern 
physics, which gives it something of an explanatory role, rather than a mere 
metaphysical one. 


Huygens. Christiaan Huygens (1629-1695), one of the men mentioned in the 
Principia as “the foremost geometers of the previous generation” (cf. page 22), 
was once referred to by Newton as “Summus Hugenius” for his exposition, and 
Chapter 8 gives some examples of his mastery of geometry. 

In 1678, Huygens published a short book, Traité de la lumière, available in English
translation as Treatise on Light, Huygens [2], and a comparison with the Principia
reveals the extent of the generational divide. The Principia is difficult to read
not only because Newton uses his geometric prowess to provide many proofs,
but also because he states and derives almost all formulas in words, often
accompanied by geometric figures. In Huygens’ work, by contrast, there basically
are no formulas at all—everything is geometry. 

Huygens objected to the notion that light consists of streams of particles, 
because there are obviously light rays simultaneously traveling in essentially all 
directions, including directly opposing directions, as when two people see each 
other, so it would seem impossible for these particles not to interfere with each 
other; and Descartes’ vague ideas had similar problems. 

Huygens instead proposed that light must spread something like sound in the 
air, produced “by a movement which is passed on successively from one part 
of the air to another; and that the spreading of this movement, taking place 
equally rapidly on all sides, ought to form spherical surfaces ever enlarging and 
which strike our ears”. For the analogous surfaces in the case of light, he chose 
the word ‘waves’ [what we would call wave fronts] “from their resemblance to 
those which are seen to be formed in water when a stone is thrown into it, and 
which present a successive spreading as circles, though these arise from another 
cause, and are only in a flat surface”, perhaps the first use of the word ‘waves’ 
in any situation other than water waves. Though differing in nature, waves in
water give a very nice example of waves in different directions passing through
each other to supplement the evidence from sound waves.

The speed of light had been estimated a little earlier, by Roemer in 1676, 
from the delay in the eclipses of Jupiter’s satellites, and Huygens explained the 
possibility of such a great, but finite, speed as Roemer’s estimate gave by pointing 
out that when a single spherical object hits a line of identical touching ones, 
the motion passes “as in an instant to the last of them”, though it is certainly 
not instantaneous, but successive, for if the movement “did not pass successively 
through all these spheres, they would all acquire the movement at the same time, 
and hence would all advance together”. He thus decided that the “particles of 
the ether”, whose motion presumably resulted in the phenomenon of light, must 
be “as nearly approaching to perfect hardness and possessing a springiness as 
prompt as we choose.” [I.e., they are perfectly elastic rigid bodies.] 

Huygens also pointed out that the particles of the ether are not arranged in
straight lines, like a row of spheres “but confusedly, so that one of them touches
several others”, as in (a), so when B hits A it will come to a stop, while A will
impart its motion to “all the spheres CCC which touch it”. So each particle
in the path of a wave will create its own secondary wave, a popular idea of the
time for water waves, and a basic tenet of his theory. Having established this
foundation, Huygens next had to explain why light, if it is a wave motion, travels
in straight lines, unlike sound. Considering a light source A and an opening
BG delimited by opaque barriers HB and GI, as in (b), he points out that the
motion should remain in the region bounded by CE, except that we also have
to consider the fact that new waves are created at each point ‘b’ of BG, not to
mention points ‘d’ further along. Huygens reasoned that the part of the motion
outside the region is very feeble compared to the part along CE, the envelope
of all these secondary waves (for a review of envelopes, see Addendum 8B).

[Figures (a) and (b)]

Noting that in the case of light, BG may be taken extremely small compared 
to the distance to the source A, “since this opening is always large enough to 
contain a great number of particles of the ethereal matter, which are of an 


492 Chapter 1D 


inconceivable smallness”, he concludes “then we may take the rays of light as 
if they were straight lines”. At this point, the “rays” have essentially become 
the orthogonal trajectories of the wave fronts, which are the envelopes of all the 
emitted spheres. (The concept of a light “ray” had a tortuous history, which 
informs the whole first chapter of Buchwald [1].) 


The creation of a new wave front by taking the envelope of secondary waves,
“Huygens’ construction”, is the second basic idea of Huygens’ theory, and he
adds wistfully, “And all this ought not to seem fraught with too much minuteness 
or subtlety, since we shall see in the sequel that all the properties of Light, and 
everything pertaining to its reflection and its refraction, can be explained in 
principle by this means.” 


Despite the highly speculative nature of the whole discussion, which more 
closely resembles a stream of consciousness novel than a scientific exposition, 
this vague mechanistic description is actually rather concrete compared to many 
of the fantastical views of light until then.¹ Moreover, after this discussion, which
ends the first chapter of Huygens’ book, the ideas are used in the short second 
chapter to give a simple explanation of the law of reflection. 


For simplicity, we draw diagrams in two dimensions; the proper additional 
considerations for three dimensions will be left to the reader—or they may be 
found in Huygens [2; pp. 31-32]. 

We consider (a) two rays, perpendicular to the wave front AC, one of which
has just hit the mirror at A, while the other will not hit the mirror until it gets
to B. The ray that has hit the mirror at A now starts the creation of a circle of
increasing radius centered at A, and at the moment (b) that the other ray hits the
mirror at B, the radius of this circle will be CB. If we draw the line BN tangent
to this circle, as in (c), then AN = CB, so the right triangles ACB and ANB
with common side AB are congruent, and thus we have $\angle CBA = \angle NAB$, and
also $\angle NBA = \angle CAB$.

[Figures (a), (b), (c): the wave front AC meeting the mirror AB; in (c) the circle about A has radius = CB and BN is tangent to it]

¹ In fact, the whole history of the theory of light is a comedy of errors (and perhaps a
prophetic warning), involving many strange philosophical fantasies and extravagances,
with one theory after another being replaced by a new theory with its own difficulties,
culminating in an exquisitely detailed theory with great predictive powers, which was
nevertheless soon supplanted by a different theory; see Addendum A.

But the same argument holds for any other ray perpendicular to the wave
front, as in (d), intersecting the mirror at $A'$, say: when our ray CB hits the
mirror at B, the tangent $BN'$ from B to the circle started at $A'$ will make an
angle $\angle N'BA' = \angle C'A'B = \angle CAB$, and thus (e) this tangent line will be the
same as the tangent line BN.

In other words, BN is the envelope of all the new circles produced by the
various rays, so it is the reflected wave front, with rays making the angle $\angle NAB$
with the mirror.


The third chapter begins with further speculative arguments, which we will
pass over, to show that the speed of light in denser matter should be slower, and
then explains the law of refraction by the following geometric argument, using
the same principles as that for the law of reflection.

We consider a wave front AC again, where the ray DA hits the boundary
between the air and water at A, producing an expanding series of circles (a).
At the time that the ray CB hits the boundary at B the radius of the circle is
$(1/k)\cdot CB$, where the speed of light in water $v_2$ is $1/k$ times $v_1$, the speed of
light in air. As before, we draw the tangent line BN to this circle.

If we consider another ray hitting the boundary at $A'$, as in (b), and draw
$A'N'$ perpendicular to BN, then by similar triangles, $A'N' = (1/k)\cdot C'B$, which
is just the radius of the circle centered at $A'$ when the ray CB hits the boundary
at B. Thus we see that BN is now the envelope of all the circles produced
by the refracted rays. Finally, for the angle of incidence $\alpha$ and the angle of
refraction $\beta$ we then have

$$\sin\alpha = \sin\angle BAC = CB/AB, \qquad \sin\beta = \sin\angle ABN = AN/AB,$$

$$\frac{\sin\alpha}{\sin\beta} = \frac{CB}{AN} = \frac{v_1}{v_2}.$$
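
The construction can be reproduced in coordinates. The following sketch assumes the boundary is the $x$-axis with $A = (0,0)$ and $B = (AB, 0)$ and water below it; the function names and sample values are illustrative only. It builds the tangent point $N$ of the circle about $A$ of radius $(1/k)\cdot CB$, reads off $\sin\beta$ from the triangle $ABN$, and checks the envelope property, namely that the distance from an intermediate point $A'$ to the line BN equals the radius $(1/k)\cdot C'B$ of the circle started there.

```python
import math


def huygens_refraction(alpha, k, AB=1.0):
    """Return (N, sin_beta) for Huygens' construction with incidence angle alpha.

    The boundary is the x-axis, A=(0,0), B=(AB,0), water is the region y<0,
    and the speed of light in water is 1/k times the speed in air.
    """
    CB = AB * math.sin(alpha)      # distance the ray through C still travels in air
    r = CB / k                     # radius of the circle about A when that ray reaches B
    phi = math.acos(r / AB)        # angle BAN, since AN is perpendicular to BN
    N = (r * math.cos(phi), -r * math.sin(phi))
    BA = (-AB, 0.0)
    BN = (N[0] - AB, N[1])
    cross = BA[0] * BN[1] - BA[1] * BN[0]
    sin_beta = abs(cross) / (math.hypot(*BA) * math.hypot(*BN))
    return N, sin_beta


def distance_to_front(s, N, AB=1.0):
    """Distance from A'=(s,0) on the boundary to the line through B and N."""
    BN = (N[0] - AB, N[1])
    BP = (s - AB, 0.0)
    return abs(BN[0] * BP[1] - BN[1] * BP[0]) / math.hypot(*BN)


if __name__ == "__main__":
    alpha, k = 0.7, 1.33
    N, sin_beta = huygens_refraction(alpha, k)
    print("sin(alpha)/sin(beta):", math.sin(alpha) / sin_beta, " k:", k)
    for s in (0.25, 0.5, 0.75):
        expected_radius = (1.0 - s) * math.sin(alpha) / k   # (1/k) * C'B
        print(s, distance_to_front(s, N), expected_radius)
```

Here $\sin\beta$ is measured from the constructed points rather than assumed, so the printed ratio equalling $k = v_1/v_2$ is a check of the geometric argument.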





In the short fourth chapter, Huygens discusses phenomena arising from the 
varying density of the atmosphere (though not mirages, first widely known in 
Europe after Napoleon’s Egypt campaign). Because of the variation in the speed 
of light at different levels of the atmosphere, the diameters of the secondary 
waves will vary as light descends toward the earth, causing the light rays to de- 
part from straight lines, though Huygens’ analysis is merely descriptive, without 
any formulas that would allow one to check predictions. 

The longest, most difficult, and most impressive chapter is the fifth, which is
an investigation of Iceland spar, a crystal with three pairs of parallel faces, all
parallelograms having the same obtuse angles (101°52′); it is easily split along
planes parallel to the faces, so that one can even get a piece with all six faces
being congruent rhombuses. The distinctive property of Iceland spar is that it
produces a double image, especially noticeable when it is placed on a printed
page. Huygens discovered that a ray of light hitting a face of the crystal is split
into two rays, one of which obeys Snell’s law, while for the other, the ratio of
the sines depends on the inclination of the initial ray, and he decided that this
second ray must therefore not come from spherical surfaces but from a more
general shape, for which he tried an ellipsoid of revolution. The analysis that he
then gave, explaining a phenomenon earlier theories could not, is now available
in Huygens [2]; a critical discussion of this analysis may be found in Appendix 1
of Buchwald [1].

Mathematically, to accommodate both “non-homogeneous” media, like the 
atmosphere, and “non-isotropic” media like Iceland spar, we must allow the 
wave fronts created at various points to have non-spherical shapes and also vary 
in size. We could do this by considering a Riemannian metric on some region
of $\mathbb{R}^3$, with the ellipsoid representing the unit sphere at each point indicating the




distance light travels in unit time in the various directions from that point. More 
generally, we can consider an “indicatrix” at each point, a surface bounding a 
body radially symmetric about the point, indicating the distance light travels 




in various directions from that point in unit time; for simplicity we will assume
the indicatrix is a smooth surface bounding a convex body. (By declaring
vectors lying on the indicatrices to be unit vectors, we can then assign lengths to
vectors at each point, basically a Finsler metric, as in DG, Vol. 2, on some
region of $\mathbb{R}^3$.)

When the indicatrices vary from point to point, Huygens’ construction must
be regarded as proceeding by “infinitesimal steps”. For example, to construct
the wave fronts for light emanating from a single point p, we first consider the
wave front $W_1$ around p after time $\varepsilon$, which will just be $\varepsilon$ times the unit sphere
of our metric at p. Then we take the envelope of the wave fronts after time $\varepsilon$
around each point of $W_1$ to get the wave front $W_2$, and then take the envelope
of the wave fronts after time $\varepsilon$ around each point of $W_2$ to get the wave front
$W_3$, ... And then, finally, we take the limit of this sequence of constructions as
$\varepsilon \to 0$. Huygens’ explanations of reflection and refraction basically used such
constructions, for the case of discontinuously changing indicatrices.

One clear reason why Huygens’ theory found few adherents was that its to- 
tally geometric formulation made computations almost impossible. In addition, 
however, at the end of his analysis of Iceland spar, he added “one more mar- 
vellous phenomenon which I discovered after having written all the foregoing”, 
which was inexplicable on the basis of his theory—he had basically discovered 
polarization, see Addendum A. Until Huygens’ original picture was given a 
complete overhaul later on, this provided yet another strong argument for the 
main competition, the corpuscular theory, formulated by our final luminary. 





Newton. Color had always been regarded as something different from light, 
perhaps a companion to light, or something that light elicited from a body 
when it illuminated it. This view was demolished by Newton (1642-1727), who
first achieved fame around 1671 with his reports on his investigations into light, 
showing that white light was a mixture of the various colors. These investi- 
gations had begun around 1666, although his famous Opticks wasn’t published 




until 1704, and his experimental work was carried out with such great accuracy 
and ability that his views, and his corpuscular theory of light, attained almost 
incontestable authority, especially after the publication of the Principia.
Newton also made many discoveries—often not further investigated—that did 
not fit in very well with his corpuscular theory, cleverly ending his Opticks with 
31 ‘Queries’, taking 60 pages, which leave the task of examining these questions 
to his successors, so that, as Ronchi [1] puts it, “no one could have worked better 
than Newton, not to build, but rather to demolish, the corpuscular theory.” 


At this point we merely want to note that Newton’s argument for Snell’s 
law was a mechanical analogy similar to Descartes’, again involving a greater 
speed of light in more dense material; Newton even suggested that the light
corpuscles were attracted by the denser material. Because of this, both sides
of the debate between wave theories and corpuscular theories felt that accurately
comparing the speed of light in water and in air (which at the time probably
seemed close to hopeless) would decide which theory would prevail.

As it turns out, this determination, made independently by Foucault and by
Fizeau and Breguet (cf. Tobin [1; Chap. 8]), occurred in 1850, by which time
the wave theory had essentially already triumphed, though, as briefly described 
in Addendum A, it was greatly changed from Huygens’ original conception. 


In the meantime, however, Newton’s authority for the speed of light being
greater in denser matter led, by a circuitous route, to a most modern-looking
variational principle.


Maupertuis. Maupertuis (1698-1759) shared Fermat’s taste for laws of physics 
derived from first principles, but the basis for Fermat’s argument was destroyed 
by Newton’s theory, as Maupertuis pointedly noted in a paper of 1744, The 
agreement between the different laws of Nature that had, until now, seemed incompatible. 


If one is bent on deducing the law of refraction from a variational principle
that assumes the speed $v_1$ of light in air is less than the speed $v_2$ of light in
water, it is not hard to concoct one. If we consider


\[
v_1\cdot AB + v_2\cdot BC = v_1\sqrt{h^2 + x^2} + v_2\sqrt{b^2 + (a - x)^2} = f(x),
\]





then setting $f'(x) = 0$ immediately gives the desired result

\[
\frac{\sin\alpha}{\sin\beta} = \frac{v_2}{v_1},
\]

so we minimize the sum of velocities times distances on the path from A to C.
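The minimization can be checked numerically. The following is only a sketch under stated assumptions: the names `action`, `minimize`, and `sine_ratio`, and the parameters `h`, `b`, `a` (vertical distance of A from the boundary, vertical distance of C from the boundary, and horizontal separation of A and C) are illustrative choices, not notation taken from the text.

```python
import math

def action(x, v1, v2, h, b, a):
    """Maupertuis' quantity v1*AB + v2*BC when the path crosses the boundary at offset x."""
    return v1 * math.hypot(h, x) + v2 * math.hypot(b, a - x)

def minimize(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a convex function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi = d
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2

def sine_ratio(v1, v2, h=1.0, b=1.0, a=1.0):
    """sin(alpha)/sin(beta) at the crossing point that minimizes the action."""
    x = minimize(lambda s: action(s, v1, v2, h, b, a), 0.0, a)
    sin_alpha = x / math.hypot(h, x)            # angle of incidence, measured from the normal
    sin_beta = (a - x) / math.hypot(b, a - x)   # angle of refraction, measured from the normal
    return sin_alpha / sin_beta
```

Calling `sine_ratio(v1, v2)` for any positive speeds should return a value close to `v2 / v1`, independently of the assumed geometry `h`, `b`, `a`.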




Maupertuis justified this as follows (Dugas [1; pp. 262–263]):


In meditating deeply on this matter, I thought that, since light has
already forsaken the shortest path when it goes from one medium to
another—the path which is a straight line—it could just as well not
follow that of the shortest time. Indeed, what preference can there be
in this matter for time or distance? Light cannot at once travel along
the shortest path and along that of the shortest time—why should it go
by one of these paths rather than by the other? Further, why should
it follow either of these two? It chooses a path which has a very real
advantage—the path which it takes is that by which the quantity of action
is the least.

It must now be explained what I mean by the quantity of action.
When a body is carried from one point to another a certain action is
necessary. This action depends on the velocity that the body has and
the distance that it travels, but is neither the velocity nor the distance
taken separately. The quantity of action is the greater as the velocity is
the greater and the path which it travels is the longer. It is proportional
to the sum of the distances, each one multiplied by the velocity with
which the body travels along it.

It is this quantity of action which is Nature’s true storehouse, and which it
economises as much as possible in the motion of light.


For a system of particles, with the velocity replaced by momentum to take 
into account the mass of the particles, the “action” defined by Maupertuis will 
be the action integral A (pages 464 and 466). Thus, in this roundabout way, 
Maupertuis succeeded in producing an incorrect principle for light that turned 
out to give a correct principle for mechanics—when given the proper formula- 
tion by Euler, with paths of constant energy. 

Since the quantity to be minimized for Fermat’s principle, the total time,
is simply $(1/v_1)\cdot AB + (1/v_2)\cdot BC$, Maupertuis had basically just replaced
$1/v$ by $v$ to get a principle consistent with the assumption $v_2 > v_1$. Conflicts
between the two views could thus be neatly finessed by minimizing the quantity
$r_1\cdot AB + r_2\cdot BC$, where $r_i$ is the “refractive index” of the $i$th medium, leaving
aside the question of whether this is proportional to velocity, for Maupertuis, or
to its reciprocal, for Fermat.


Maupertuis certainly had no doubts on the matter. Unconstrained by the
defect of false modesty, he wrote much later in his Essai de Cosmologie of 1751:


After so many great men have worked on this matter, I hardly dare 
say I have discovered the principle on which all the laws of motion are 
founded; ... Our principle, more in conformity with the ideas of things 




that we should have, leaves the world in its natural need of the Creator, 
and is a necessary result of the wisest doing of that same power. ... 
What satisfaction for the human mind, in contemplating these laws— 
so beautiful and so simple—that they may be the only ones that the
Creator and the Director of things has established in matter in order 
to accomplish all the phenomena of the visible world. 


To bring the story full-circle, we temporarily trespass into the domain of
modern physics. Maupertuis’ action has dimensions Mass × Velocity × Length,
or $MVL = ML^2T^{-1}$, but nowadays usually written as $MV\cdot VT = MV^2T = ET$ (where
E = Energy), like the fundamental constant $h = 6.626\cdot 10^{-27}$ erg sec, which
Planck introduced to solve a very specific classical problem, but which eventually
became a cornerstone of quantum mechanics, somewhat to his chagrin.

And, undoubtedly to the chagrin of many others, Planck, like Maupertuis, 
believed that an argument for the existence of a supreme being can be found 
in teleological principles, with special reference to the principle of least action 
(except that Planck was using this term to refer not to Maupertuis’ principle 
of least action, but to what we now call Hamilton’s principle); see his lengthy
essay “Religion and Natural Science” in Planck [1]. 


Malus. After Kepler, Galileo, Descartes, Fermat, Huygens, and Newton, with 
a detour to Maupertuis, we come to a much lesser known scientist whose work 
will provide us with a link between the study of light and modern mechanics, 
Etienne-Louis Malus (1775-1812), never mentioned in mechanics books, and not 
all that often in texts on optics. 

He receives much more attention in Buchwald [1, pp. 23ff], which describes 
the horrific circumstances leading to his study of optics, emphasizes his role in 
turning experimental optics from an activity using hardly any mathematics and 
unconcerned with obtaining or presenting hard data, into a modern science, 
and describes his work in detail, especially his work on polarization. 

A graduate of the newly founded Ecole Polytechnique in Paris, where the 
chemistry of the time was one of the subjects of instruction, his original hope 
was to prove that light is a compound of caloric and oxygen (I kid you not), but 
happily the first paper he presented in Paris was not concerned with the physical
structure of light at all. He considered rays of light emanating from a single point 
that encounter the surface of a body that either reflects or refracts them, and 
proved that the resulting rays are all normal to some surface. Though sounding
similar to Huygens’ ideas, this surface was not thought of as a wave front, but
was simply a mathematical entity that happened to exist as a consequence of the
standard laws of reflection and refraction, and Malus’s proof was a complicated
analytic derivation from these laws.

[Figure: the refracting surface and the orthogonal surface.]

The interest for optics was presumably in following the rays in the opposite 
direction, to deduce that they can all be focused at a point only if they were all 
orthogonal to some surface. In any case, Malus’ paper was apparently received 
very favorably, and must have generalized many specific optical investigations, 
for the committee of Laplace, Lagrange, Monge, and Lacroix chosen by the 
Académie des Sciences to read the paper concluded: 


To apply thus, without any limitation on its generality, calculation to 
phenomena;—to deduce, from a single consideration of a very general 
kind, all the solutions which before were only obtained from particular 
considerations,—is truly to write a treatise on analytical optics, which, 
considering the whole science in a single point of view, cannot but 
contribute to the extension of its domain. 


Malus did not try to show that if this new collection of rays undergoes another 
reflection or refraction then there will again be a surface normal to the resulting 
rays, let alone the more general claim that any system of rays orthogonal to a 
surface becomes another such system under a reflection or refraction. In fact, 
an error in his reasoning led Malus to the conclusion that his result definitely 
could not be extended to a second reflection or refraction. 

But in 1816 Dupin proved the result for reflection of any given system of rays 
orthogonal to a surface (which so upset the Académie that they appointed a 
special investigative committee, which discovered Malus’ error), and he conjec- 
tured that it was also true in the case of refractions. 

Proofs of this were given in 1825 by Quetelet and Gergonne (further details
may be found in Atzema [1]), establishing “Malus’ theorem” for any number of 
reflections and refractions. 


Fortunately for the science of mechanics, all these later papers were unknown
to an Irish mathematical prodigy who had read Malus’ paper with interest. 




ADDENDUM 15A 
BATTLING TO A DRAW 


“Geometrical optics” is the term for the analysis of light that depends only on 
the laws of reflection and refraction—particularly important for the design of 
mirrors and lenses—by investigating the way that a family of light rays behaves 
as it hits a mirror or passes through a lens, ignoring as much as possible the 
many subtle phenomena concerning light. 

Geometrical optics is still an important and challenging field of study, but 
in the first half of the 19th century a theory of light was finally created that
could explain the various subtleties, a wave theory of light that eventually totally 
displaced all previous theories, including Huygens’ wave theory, though aspects 
of Huygens’ theory were incorporated into this new theory. 


The first important phenomenon about light that presented a challenge was 
diffraction, the fact that light does not travel in exact straight lines, but does 
bend a bit, just like sound waves, although observing this is much more
difficult because of the short wave lengths involved, and many complicated effects
are observed, especially because different colors diffract differently. It was first 
discovered, probably accidentally, by Father Francesco Maria Grimaldi (1618- 
1663), whose book on the subject of light was published in 1665, and ignored 
for a long time, though his work involving colors may have instigated Newton’s 
original experiments. 

On the other hand, the second important phenomenon that has to be ac- 
counted for is the “marvellous phenomenon” that Huygens discovered about 
Iceland spar (page 495). He noted that when a ray of light was split into two 
rays as it passed through the crystal, these rays no longer acted like ordinary 
light rays in terms of their passage through another crystal, and in particular 
acted quite differently when the second crystal was perpendicular to the first 
rather than parallel to it. This phenomenon of polarization, where light rays
seemed to have a specific orientation in the plane perpendicular to the direc- 
tion of their propagation, seemed wholly incompatible with Huygens’ picture 
of wave propagation.


Thus Newton’s corpuscular theory had remained dominant, though experts 
knew of all sorts of troubling difficulties, until the beginning of the 19th century,
when the third important phenomenon was discovered. In 1802, the English 
physician Thomas Young (1773-1829) published his first account of new
experiments that called the corpuscular theory into question, and he soon produced
the famous Young interference experiment, where light from a point source 
passing through two small holes close to each other produces an interference 




pattern, from which Young could even calculate wave lengths of different colors, 
these waves being assumed transversal to the direction of propagation, rather 
than longitudinal like sound waves, to which light had so often been compared 
over the centuries. These heretical ideas were naturally denounced by the
stalwart defenders of the Newtonian dogma, and his ideas were roundly derided.

Fortunately, somewhat similar ideas put forth at about the same time by the 
French civil engineer Augustin Fresnel (1788-1827) elicited a much more encour- 
aging reception, and Fresnel rapidly began to develop a complex new theory 
involving intricate calculations, also invigorating Young to add further
contributions. The assumption of transversal waves enabled the phenomenon of
polarization to be accounted for, and Fresnel was able to account for diffraction
phenomena by adding in Huygens’ construction, together with the complicating 
factor that the various secondary waves would interfere with each other. 

Thus, in the brief space of the quarter century between Young’s initial paper 
and Fresnel’s early death, an English physician and a French civil engineer 
succeeded in creating a detailed elaborate theory destined to quickly displace 
all previous theories, and the experiments carried out in 1850, establishing that 
the speed of light in air is greater than the speed in water, were simply the 
coup de grace that finally silenced the last stubborn adherents of Newton’s 
theory. 


And then the whole apparatus of the new wave theory was subsumed, at least 
theoretically, around 1864, by Maxwell’s theory of electromagnetic waves, with 
investigations by Kirchhoff and others establishing that optics may be deduced 
as the limiting case of very small wave lengths, and the entire idea of secondary 
waves was upended (though calculations using Fresnel’s theory are still often 
the only reasonable mathematical approach to finding solutions). One can now 
look at the telescope of history from the other end, as in Born and Wolf [1], 
which begins with Maxwell’s equations on page 1, and then goes on to give 
detailed and thorough discussions of a large class of optical phenomena from 
the point of view of the wave equation. 


That wasn’t the end, of course, since modern quantum theory says that light
is a wave and a particle. Perhaps some might still say, with Grimaldi, “let us 
be honest we do not really know anything about the nature of light and it is 
dishonest to use big words [or equations] which are meaningless.” 




ADDENDUM 15B 
HUYGENS’ PRINCIPLE 


We have followed Born and Wolf [1] in referring to “Huygens’ construction”, 
because once the wave equation became the basis for the theory of light it was 
possible to pose explicit mathematical questions, quite unrelated to Huygens’ 
original mechanistic ideas, which led to results about partial differential equa- 
tions that now, somewhat inexplicably, have Huygens’ name attached to them. 


In Chapter 8 we considered the 1-dimensional wave equation

\[
u_{tt} - v^2 u_{xx} = 0,
\]

and (Problem 8-5) we found that in terms of initial conditions

\[
u(x,0) = \phi(x), \qquad u_t(x,0) = \psi(x)
\]

we have d’Alembert’s formula

\[
u(x,t) = \frac{\phi(x + vt) + \phi(x - vt)}{2} + \frac{1}{2v}\int_{x-vt}^{x+vt} \psi(s)\,ds.
\]


This formula shows (a) that for a point $x$ and a time $t$, the value of $u(x,t)$
depends on the values of $\phi$ and $\psi$ on the interval $[x - vt, x + vt]$, the domain
of dependence of $(x,t)$. Or (b), we can consider the range of influence of $x$, the
set of all points $(x_1,t)$ whose domain of dependence contains $x$.

[Figure: the range of influence of $x$ and the domain of dependence of $(x,t)$.]

The inverted triangular shape indicates a finite propagation speed: for $x_1 > x$ the value of
$u$ at points $(x_1,t)$ with $t < (x_1 - x)/v$ is completely independent of $\phi(x)$
and $\psi(x)$. On the other hand, notice that the value of $u(x_1,t)$ for all $t >
(x_1 - x)/v$ does depend on these values; in a 1-dimensional world a brief noise
at $x$ at time $t = 0$ will continue to have reverberations at $x_1$ forever after time
$(x_1 - x)/v$, which is certainly not what we experience in our 3-dimensional
world.
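The lingering reverberation can be seen by evaluating d’Alembert’s formula directly. This is a minimal sketch, not part of the original exposition: the function names and the particular smooth `bump` used as the initial velocity are assumptions made for illustration, and the integral is approximated by the midpoint rule.

```python
import math

def dalembert(phi, psi, x, t, v=1.0, steps=2000):
    """d'Alembert's formula, with the integral of psi approximated by the midpoint rule."""
    lo, hi = x - v * t, x + v * t
    width = (hi - lo) / steps
    integral = sum(psi(lo + (k + 0.5) * width) for k in range(steps)) * width
    return 0.5 * (phi(x + v * t) + phi(x - v * t)) + integral / (2 * v)

def bump(s):
    """A smooth 'brief noise' supported in the interval (-1, 1)."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1 else 0.0

def zero(s):
    """Zero initial displacement."""
    return 0.0
```

With `phi = zero` and `psi = bump`, the value `dalembert(zero, bump, 5.0, t)` vanishes for `t < 4`, and for every `t > 6` it equals the same positive constant, half the integral of `bump` divided by `v`: the disturbance never dies away at the observation point.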




There is a neat analytical trick, Poisson’s method of spherical means, that
enables us to obtain the solution for the 3-dimensional wave equation,

\[
u_{tt} = v^2\Delta u = v^2(u_{xx} + u_{yy} + u_{zz}),
\]

from those of the 1-dimensional wave equation. We follow the exposition in
McOwen [1] pretty closely.


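(a) Spherical means. For a continuous function $f$ on $\mathbb{R}^n$, its spherical mean
over a sphere with center $x$ and radius $r$ is

\[
\text{(1)}\qquad M_f(x,r) = \frac{1}{\omega_{n-1}}\int_{|\xi|=1} f(x + r\xi)\,dS(\xi),
\]

where $dS$ denotes the $(n-1)$-dimensional volume element on the unit sphere
$S^{n-1} \subset \mathbb{R}^n$, and $\omega_{n-1}$ is the total $(n-1)$-dimensional volume of $S^{n-1}$. Although
$M_f(x,0)$ isn’t defined, it is easy to see that

\[
\text{(2)}\qquad \lim_{r\to 0} M_f(x,r) = f(x).
\]

From (1) we obtain

\[
\frac{\partial}{\partial r}M_f(x,r) = \frac{1}{\omega_{n-1}}\int_{|\xi|=1}\sum_{i=1}^n f_{x_i}(x + r\xi)\,\xi_i\,dS(\xi),
\]

and the integrand can be written as $\langle X,\nu\rangle$, where $\nu$ is the unit outward normal
on the unit n-disk and $X$ is the vector field

\[
X = \Bigl(\frac{\partial f}{\partial x_1}(x + r\xi),\dots,\frac{\partial f}{\partial x_n}(x + r\xi)\Bigr).
\]

The divergence of $X$ (page 138) is

\[
\operatorname{div} X = r\sum_{i=1}^n f_{x_ix_i}(x + r\xi) = r\,\Delta_x f(x + r\xi),
\]

so the divergence theorem (page 139) gives

\[
\text{(3)}\qquad \frac{\partial}{\partial r}M_f(x,r) = \frac{r}{\omega_{n-1}}\,\Delta_x\int_{|\xi|<1} f(x + r\xi)\,d\xi.
\]

The substitution $\xi' = r\xi$, $d\xi' = r^n\,d\xi$ gives

\[
\int_{|\xi|<1} f(x + r\xi)\,d\xi = \frac{1}{r^n}\int_{|\xi'|<r} f(x + \xi')\,d\xi',
\]

and in spherical coordinates we can write

\[
\frac{1}{\omega_{n-1}}\int_{|\xi'|<r} f(x + \xi')\,d\xi'
= \frac{1}{\omega_{n-1}}\int_0^r \rho^{n-1}\int_{|\xi|=1} f(x + \rho\xi)\,dS(\xi)\,d\rho
= \int_0^r \rho^{n-1}M_f(x,\rho)\,d\rho.
\]

Substituting back into (3) we obtain

\[
\frac{\partial}{\partial r}M_f(x,r) = \frac{1}{r^{n-1}}\,\Delta_x\int_0^r \rho^{n-1}M_f(x,\rho)\,d\rho.
\]

Finally, multiplying by $r^{n-1}$, taking $\partial/\partial r$, and dividing by $r^{n-1}$, we get the
Darboux equation

\[
\text{(4)}\qquad \Bigl(\frac{\partial^2}{\partial r^2} + \frac{n-1}{r}\frac{\partial}{\partial r}\Bigr)M_f(x,r) = \Delta_x M_f(x,r).
\]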

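(b) Application to the wave equation. Now consider the equation

\[
\text{(5)}\qquad u_{tt} = v^2\Delta u
\]

on $\mathbb{R}^n$ for $t > 0$, with the initial conditions

\[
\text{(6)}\qquad u(x,0) = \phi(x), \qquad u_t(x,0) = \psi(x).
\]

If $u$ satisfies (5), then

\[
\frac{\partial^2}{\partial t^2}M_u(x,r,t)
= \frac{1}{\omega_{n-1}}\int_{|\xi|=1} u_{tt}(x + r\xi,t)\,dS(\xi)
= \frac{1}{\omega_{n-1}}\int_{|\xi|=1} v^2\Delta u(x + r\xi,t)\,dS(\xi)
= v^2\Delta M_u(x,r,t),
\]

and (4) implies the Euler-Poisson-Darboux equation

\[
\text{(7)}\qquad \frac{\partial^2}{\partial t^2}M_u(x,r,t) = v^2\Bigl(\frac{\partial^2}{\partial r^2} + \frac{n-1}{r}\frac{\partial}{\partial r}\Bigr)M_u(x,r,t),
\]

while the initial conditions (6) give

\[
\text{(8)}\qquad M_u(x,r,0) = M_\phi(x,r), \qquad \frac{\partial}{\partial t}M_u(x,r,0) = M_\psi(x,r).
\]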




If we can solve (7) and (8) to find $M_u(x,r,t)$, then by (2), we will have

\[
u(x,t) = \lim_{r\to 0} M_u(x,r,t).
\]


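(c) The wave equation in dimension 3. When $n = 3$, a little calculation shows
that equation (7) works out to be equivalent to

\[
\text{(9)}\qquad \frac{\partial^2}{\partial t^2}\bigl(rM_u(x,r,t)\bigr) = v^2\frac{\partial^2}{\partial r^2}\bigl(rM_u(x,r,t)\bigr).
\]

This means that for each $x$, the function

\[
U^x(r,t) = rM_u(x,r,t)
\]

is a solution of the 1-dimensional wave equation for $r,t > 0$:

\[
\frac{\partial^2}{\partial t^2}U^x(r,t) = v^2\frac{\partial^2}{\partial r^2}U^x(r,t),
\]

while (8) shows that we have

\[
U^x(r,0) = rM_\phi(x,r) = \Phi^x(r), \text{ say}, \qquad
U^x_t(r,0) = rM_\psi(x,r) = \Psi^x(r), \text{ say},
\]

and equation (2) gives

\[
U^x(0,t) = \lim_{r\to 0} rM_u(x,r,t) = 0\cdot u(x,t) = 0.
\]

Since $\Phi^x(0) = 0 = \Psi^x(0)$, Problem 8-5 (b) shows that if we extend $\Phi^x$ and $\Psi^x$
to be odd functions of $r$, we can use d’Alembert’s formula to obtain

\[
U^x(r,t) = \frac{\Phi^x(r + vt) + \Phi^x(r - vt)}{2} + \frac{1}{2v}\int_{r-vt}^{r+vt} \Psi^x(\rho)\,d\rho.
\]

Since $\Phi^x$ and $\Psi^x$ are odd functions, for $r < vt$ we have

\[
\Phi^x(r - vt) = -\Phi^x(vt - r), \qquad \int_{r-vt}^{r+vt} \Psi^x(\rho)\,d\rho = \int_{vt-r}^{vt+r} \Psi^x(\rho)\,d\rho,
\]

and thus

\[
M_u(x,r,t) = \frac{1}{r}U^x(r,t)
= \frac{\Phi^x(vt + r) - \Phi^x(vt - r)}{2r} + \frac{1}{2vr}\int_{vt-r}^{vt+r} \Psi^x(\rho)\,d\rho
\]
\[
= \frac{(vt + r)M_\phi(x,vt + r) - (vt - r)M_\phi(x,vt - r)}{2r} + \frac{1}{2vr}\int_{vt-r}^{vt+r} \rho M_\psi(x,\rho)\,d\rho.
\]

Letting $r \to 0$ we obtain

\[
u(x,t) = \frac{\partial}{\partial\rho}\bigl(\rho M_\phi(x,\rho)\bigr)\Big|_{\rho=vt} + tM_\psi(x,vt)
= \frac{\partial}{\partial t}\bigl(tM_\phi(x,vt)\bigr) + tM_\psi(x,vt),
\]

leading to Kirchhoff’s formula

\[
\text{(10)}\qquad u(x,t) = \frac{1}{4\pi}\frac{\partial}{\partial t}\Bigl(t\int_{|\xi|=1} \phi(x + vt\xi)\,dS(\xi)\Bigr) + \frac{t}{4\pi}\int_{|\xi|=1} \psi(x + vt\xi)\,dS(\xi).
\]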


From Kirchhoff’s formula, we obtain the following picture for the domain
of dependence and range of influence in the case of the 3-dimensional wave
equation, in contrast to the picture on page 502 for the 1-dimensional wave
equation.

[Figure: range of influence (surface of a cone); domain of dependence (a 2-sphere); the time $|x_1 - x_0|/v$ is marked.]

We again have a finite propagation speed, but now we have sharp
signals. Noise at $x_0$ at time 0 is experienced at $x_1$ only at time $|x_1 - x_0|/v$.
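The sharpness of signals can be made concrete with a closed-form evaluation of Kirchhoff’s formula. This is a sketch under assumptions that go beyond the text: the initial data are taken to be $\phi = 0$ and $\psi$ equal to 1 on a ball of radius `a` about the origin (a discontinuous $\psi$, so the formula is used only formally), and the function name is hypothetical. In that case $u(x,t) = tM_\psi(x,vt)$, and $M_\psi$ is the fraction of the sphere of radius $vt$ about $x$ that lies inside the ball, which is the area fraction of a spherical cap.

```python
def kirchhoff_3d(d, t, v=1.0, a=1.0):
    """u at distance d > a from the origin, for phi = 0 and psi = 1 on the ball |y| < a."""
    R = v * t
    if R <= d - a or R >= d + a:
        return 0.0                      # the sphere of radius vt misses the ball entirely
    # Points of the sphere at angle theta from the direction toward the origin lie
    # in the ball exactly when cos(theta) > c; that cap has area fraction (1 - c) / 2.
    c = (d * d + R * R - a * a) / (2 * d * R)
    return t * (1 - c) / 2
```

At distance `d = 5` with `v = a = 1`, the value is nonzero only for `4 < t < 6`: the disturbance arrives, passes, and leaves no trace.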


(d) The wave equation in dimension 2. Although this trick doesn’t work for 
the 2-dimensional wave equation, we can analyze that case by another trick, 
Hadamard’s method of descent, where we view the 2-dimensional wave equation 
as the special case of the 3-dimensional problem where the initial conditions are 
independent of x3. 

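Letting $\xi_1, \xi_2$ be coordinates in $\mathbb{R}^2$, the upper half of the unit sphere in $\mathbb{R}^3$ is
the graph of $f(\xi_1,\xi_2) = \sqrt{1 - \xi_1^2 - \xi_2^2}$, so we can write

\[
dS(\xi, f(\xi)) = \sqrt{1 + (\partial f/\partial\xi_1)^2 + (\partial f/\partial\xi_2)^2}\;d\xi_1\,d\xi_2 = \frac{d\xi_1\,d\xi_2}{\sqrt{1 - \xi_1^2 - \xi_2^2}}.
\]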




To use (10) we just have to multiply the integral over the upper half of the unit
sphere by 2, to obtain

\[
u(x_1,x_2,t) = \frac{1}{4\pi}\frac{\partial}{\partial t}\Bigl(2t\int_{\xi_1^2+\xi_2^2<1} \frac{\phi(x_1 + vt\xi_1,\,x_2 + vt\xi_2)\,d\xi_1\,d\xi_2}{\sqrt{1 - \xi_1^2 - \xi_2^2}}\Bigr)
+ \frac{t}{4\pi}\,2\int_{\xi_1^2+\xi_2^2<1} \frac{\psi(x_1 + vt\xi_1,\,x_2 + vt\xi_2)\,d\xi_1\,d\xi_2}{\sqrt{1 - \xi_1^2 - \xi_2^2}}.
\]

From this formula we now get the following picture for the domain of
dependence and range of influence in the case of the 2-dimensional wave equation,
where, once again, a brief disturbance will continue to be felt after it first reaches
another point.

[Figure: range of influence (solid cone); domain of dependence (a 2-disc).]

A physical example is given by water waves. Dropping a pebble
into a large bowl of water produces expanding circular waves that continue to
be felt after the first time they reach another point.
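The persistence of the disturbance in two dimensions can be checked numerically from the formula obtained by descent. The sketch below rests on assumptions not made in the text: $\phi = 0$, $\psi$ equal to 1 on the disc of radius `a` about the origin, a hypothetical function name, and a crude midpoint grid. After the substitution $y = x + vt\xi$ the $\psi$ term becomes $\frac{1}{2\pi v}\int \psi(y)\,dy/\sqrt{(vt)^2 - |y - x|^2}$ over the part of the source disc with $|y - x| < vt$.

```python
import math

def descent_2d(d, t, v=1.0, a=1.0, n=200):
    """u at the point (d, 0), d > a, for phi = 0 and psi = 1 on the disc |y| < a."""
    R = v * t
    h = 2 * a / n                       # cell size of a midpoint grid over [-a, a]^2
    total = 0.0
    for i in range(n):
        y1 = -a + (i + 0.5) * h
        for j in range(n):
            y2 = -a + (j + 0.5) * h
            if y1 * y1 + y2 * y2 >= a * a:
                continue                # outside the source disc, psi = 0
            gap = R * R - ((y1 - d) ** 2 + y2 ** 2)
            if gap > 0:                 # inside the domain of dependence of (x, t)
                total += h * h / math.sqrt(gap)
    return total / (2 * math.pi * v)
```

At distance `d = 5` with `v = a = 1`, the value is zero for `t < 4` but stays strictly positive for all `t > 6`, decaying only slowly; the grid is too coarse to be trusted while the wave front is crossing the source disc.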




It turns out (McOwen [1; §3.2e]) that this pattern persists for all $n > 1$:
we have sharp signals for odd $n$ but not for even $n$. This fact is sometimes
referred to as Huygens’ Principle, which would seem to be an overly generous
attribution. Sometimes the use of the term Huygens’ Principle is meant to bring
attention to the domain of dependence, rather than the range of influence.
For example, in three dimensions, the fact that the domain of dependence is
the surface of a sphere, rather than the 3-disk it bounds, might be regarded as
related to Huygens’ construction of a new wave front by taking the envelope
of secondary waves along the old wave front, although the formulas deduced
so far say nothing at all about envelopes (and it isn’t quite clear from Huygens’
descriptions whether he would have regarded the old wave front as the entire
domain of dependence). In fact, in connection with this interpretation one will
sometimes find the statement that although Huygens’ Principle is true in three
and higher odd dimensions, it is false in even dimensions, as if Huygens had
pronounced a principle that happened to be correct by a lucky accident.


On the other hand, there is a theorem, true in all dimensions, that really
does formalize Huygens’ construction, which we will briefly discuss later, in
Addendum 18B.


PART IV 


HAMILTONIAN 
MECHANICS 


FROM ARAGONITE 
TO THE SCHRÖDINGER
WAVE EQUATION 


CHAPTER 16 
THE COTANGENT BUNDLE 


After Lagrangian mechanics, which takes place on the tangent bundle $TM$
of a manifold, we move to Hamiltonian mechanics, which occurs on the
cotangent bundle $T^*M$, where each tangent space $M_x$ is replaced by its dual
space $M_x^*$. For convenience, we use the same map $\pi$ to indicate the projection
$\pi\colon T^*M \to M$ as we used for $TM$. We generally follow the notation of DG,
with some special additions for mechanics.
For a coordinate system $(q^1,\dots,q^n)$ on $U \subset M$, the bases $\partial/\partial q^i$ for the fibres
of $TM$ over $U$ determine dual bases for the fibres of $T^*M$ over $U$, which we
will denote by $p_1,\dots,p_n$, so that for $\lambda \in \pi^{-1}(U) \subset T^*M$ we have

\[
\text{(a)}\qquad p_i(\lambda) = \lambda(\partial/\partial q^i).
\]

On $T^*M$ we then have the coordinate system $(q^1\circ\pi,\dots,q^n\circ\pi,p_1,\dots,p_n)$,
which for the moment will be scrupulously denoted by $(\bar q^1,\dots,\bar q^n,p_1,\dots,p_n)$.


Special features of the cotangent bundle. For each coordinate system $(\bar q, p) =
(\bar q^1,\dots,\bar q^n,p_1,\dots,p_n)$ on $T^*M$ we can write down the 1-form

\[
\theta = \sum_{i=1}^n p_i\,d\bar q^i.
\]

This 1-form $\theta$ will be very important in the sequel, and it turns out to be
independent of the coordinate system $(\bar q, p)$. This can be checked by a direct,
slightly confusing, calculation in coordinates, but we can simply note that the
definition of the $p_i$ in equation (a) can be written as

\[
\lambda = \sum_{i=1}^n p_i(\lambda)\,dq^i(\pi(\lambda)),
\]

and thus for a tangent vector $X$ on $T^*M$ at the “point” $\lambda \in T^*M$, we have

\[
\lambda(\pi_*X) = \sum_{i=1}^n p_i(\lambda)\,dq^i(\pi_*X) = \sum_{i=1}^n p_i(\lambda)\,d\bar q^i(X) = \theta(X),
\]

so that we have the invariant definition

\[
\theta(\lambda)(X) = \lambda(\pi_*X).
\]

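If this also seems slightly confusing, one can just check that this invariantly
defined 1-form $\theta$ is equal to $\sum_{i=1}^n p_i\,d\bar q^i$ by noting that

they both give 0 for $X = \left.\dfrac{\partial}{\partial p_i}\right|_\lambda$, since $\pi_*\Bigl(\left.\dfrac{\partial}{\partial p_i}\right|_\lambda\Bigr) = 0$;

they both give $p_i(\lambda)$ for $X = \left.\dfrac{\partial}{\partial \bar q^i}\right|_\lambda$, since $\pi_*\Bigl(\left.\dfrac{\partial}{\partial \bar q^i}\right|_\lambda\Bigr) = \left.\dfrac{\partial}{\partial q^i}\right|_{\pi(\lambda)}$.

Having settled this matter, requiring a careful distinction between $\bar q^i$ and $q^i$,
we henceforth generally ignore the distinction and write $\theta = \sum_{i=1}^n p_i\,dq^i$.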
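Using $\theta$, we now define the even more important 2-form $\omega$ by

\[
\omega = d\theta = \sum_{i=1}^n dp_i \wedge dq^i.
\]

It is easily checked that

\[
\omega\Bigl(\sum_i A_i\frac{\partial}{\partial q^i} + B_i\frac{\partial}{\partial p_i},\ \sum_k C_k\frac{\partial}{\partial q^k} + D_k\frac{\partial}{\partial p_k}\Bigr) = \sum_i (B_iC_i - A_iD_i).
\]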


It follows, in particular, that $\omega$ is nondegenerate: for $X$ on $T^*M$, if $\omega(X,Y) = 0$
for all $Y$, then $X = 0$. For the interior product or contraction $X \mathbin{\lrcorner} \omega$ defined by

\[
(X \mathbin{\lrcorner} \omega)(Y) = \omega(X,Y)
\]

we can say that $X \mapsto X \mathbin{\lrcorner} \omega$ is one-one from the tangent bundle $T(T^*M)$ of
$T^*M$ to its cotangent bundle $T^*(T^*M)$. The notation $i_X\omega$ is often used for
$X \mathbin{\lrcorner} \omega$, but we will stick with the “hook” notation.
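Nondegeneracy can be illustrated at a single point by working with components. The following is a minimal sketch, assuming that a tangent vector is represented by its pair of component lists along $\partial/\partial q^i$ and $\partial/\partial p_i$; the names `omega` and `partner` are hypothetical and used only for illustration.

```python
def omega(X, Y):
    """omega(X, Y) for X = (A, B) and Y = (C, D): components along d/dq and d/dp."""
    (A, B), (C, D) = X, Y
    # Each dp_i ^ dq^i contributes dp_i(X) dq^i(Y) - dp_i(Y) dq^i(X) = B_i C_i - A_i D_i.
    return sum(b * c - a * d for a, b, c, d in zip(A, B, C, D))

def partner(X):
    """A vector Y with omega(X, Y) = sum(A_i^2 + B_i^2), which is nonzero unless X = 0."""
    A, B = X
    return (list(B), [-a for a in A])
```

Since `omega(X, partner(X))` is a sum of squares of the components of `X`, the only vector paired to zero with every `Y` is `X = 0`, which is the nondegeneracy statement in coordinates.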

Note that since $\omega = d\theta$, we have $d\omega = 0$, which also follows directly from
the expression $\omega = \sum_{i=1}^n dp_i \wedge dq^i$.


WARNING. Sometimes $\omega$ is given the opposite sign, $\omega = \sum_{i=1}^n dq^i \wedge dp_i$, which
affects multitudes of additional formulas later on.


It is not hard to see that the n-fold wedge product $\omega \wedge \cdots \wedge \omega$ can be written
as

\[
\omega \wedge \cdots \wedge \omega = (-1)^N n!\;dq^1 \wedge \cdots \wedge dq^n \wedge dp_1 \wedge \cdots \wedge dp_n,
\]

for some $N$ (the precise value is $N = n(n + 1)/2$, but who’s counting?). This
is non-zero everywhere, so we can use $(1/n!)(-1)^N\,\omega \wedge \cdots \wedge \omega$ as a volume
element on $T^*M$, which reduces to the standard volume on $\mathbb{R}^{2n}$ when $M = \mathbb{R}^n$;
in particular, $T^*M$ is always orientable.
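For anyone who does want to count, the sign can be verified mechanically. The sketch below (the function name `reorder_sign` is an illustrative choice) computes the sign of the permutation that rearranges the factors $dp_1, dq^1, \dots, dp_n, dq^n$ of $dp_1 \wedge dq^1 \wedge \cdots \wedge dp_n \wedge dq^n$ into the order $dq^1, \dots, dq^n, dp_1, \dots, dp_n$, by counting inversions.

```python
def reorder_sign(n):
    """Sign of the permutation taking (p1, q1, ..., pn, qn) to (q1, ..., qn, p1, ..., pn)."""
    # Target position of each 1-form: dq^i goes to slot i, dp_i goes to slot n + i.
    positions = []
    for i in range(n):
        positions.append(n + i)   # dp_i comes first in each factor dp_i ^ dq^i
        positions.append(i)       # then dq^i
    inversions = sum(
        1
        for a in range(2 * n)
        for b in range(a + 1, 2 * n)
        if positions[a] > positions[b]
    )
    return -1 if inversions % 2 else 1
```

The number of inversions is exactly $n(n+1)/2$, one for each pair $(dp_i, dq^j)$ with $i \le j$, so `reorder_sign(n)` agrees with $(-1)^{n(n+1)/2}$.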




The Legendre transform. A mathematical device now quite important for
mechanics appeared in the paper Legendre [1] of 1787, in connection with the
partial differential equation for minimal surfaces. Many differential geometry
books can be consulted for proofs that (1) the graph of $f\colon \mathbb{R}^2 \to \mathbb{R}$ will be a
critical point for the area function if and only if it has mean curvature $H = 0$,
and (2) this is equivalent to the equation

\[
(1 + f_y^2)f_{xx} - 2f_xf_yf_{xy} + (1 + f_x^2)f_{yy} = 0,
\]

where subscripts denote partial derivatives.¹


Legendre introduced a way of transforming a partial differential equation of 
this sort into a simpler one by using the partial derivatives of the function f as 
the new variables. Though often stated as a formal relationship, the definition 
of the Legendre transformation can most easily be understood geometrically. 
We start with the simplest case of a function f: R > R. 

Suppose $f'' > 0$ on an interval $I \subset \mathbb{R}$, so that $f$ is convex on this interval, with monotonically increasing derivative $f'$ having values in the interval $f'(I)$. Any number $p \in f'(I)$ is then the derivative of $f$ at some unique point in $I$, so there is a unique tangent line to the graph of $f$ with slope $p$. We let $g(p)$ be the negative of the $y$-intercept of this tangent line. We thus obtain a new function $g\colon f'(I) \to \mathbb{R}$, the Legendre transform of $f$, which we will sometimes write as $\mathcal{L}f$ or $\mathcal{L}(f)$ [there is no even remotely standard notation for $g$].

[Figure: the tangent line of slope $p$ to the graph of $f$ and its $y$-intercept, drawn in a case where $g(p) < 0$.]

Since the $x$-coordinate of the point of tangency is $(f')^{-1}(p)$, and thus the $y$-coordinate is $f\bigl((f')^{-1}(p)\bigr)$, the tangent line is the graph of the equation $y - f\bigl((f')^{-1}(p)\bigr) = p \cdot [x - (f')^{-1}(p)]$. The $y$-intercept is the value of $y$ for $x = 0$, and the negative, $g(p)$, is

$$g(p) = p \cdot (f')^{-1}(p) - f\bigl((f')^{-1}(p)\bigr). \tag{a}$$

This is often stated, for computational purposes, and for those allergic to inverse functions, as

$$g(p) = px - f(x) \quad \text{for the (unique) } x \in I \text{ with } f'(x) = p. \tag{a$'$}$$

¹ DG is probably one of the worst references; (2) occurs in Vol. 3, pg. 137, and (1) in Vol. 4, pg. 262! Sic semper “Comprehensive”.


514 Chapter 16 


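Since $f'' > 0$ implies that $(f')^{-1}$ is differentiable everywhere, we can differentiate (a) to obtain

$$g'(p) = (f')^{-1}(p) + p \cdot \bigl((f')^{-1}\bigr)'(p) - f'\bigl((f')^{-1}(p)\bigr) \cdot \bigl((f')^{-1}\bigr)'(p),$$

with the second and third terms canceling, so that

$$g'(p) = (f')^{-1}(p). \tag{b}$$

It follows that $g'' > 0$, so we can consider the Legendre transform $h$ of $g$. And, as (b) might suggest, the Legendre transformation is involutive, $\mathcal{L}(\mathcal{L}(f)) = f$:

$$\begin{aligned}
h(x) &= x \cdot (g')^{-1}(x) - g\bigl((g')^{-1}(x)\bigr) \\
&= x f'(x) - g(f'(x)) && \text{by (b)} \\
&= x f'(x) - f'(x) \cdot (f')^{-1}(f'(x)) + f\bigl((f')^{-1}(f'(x))\bigr) && \text{by (a)} \\
&= f(x).
\end{aligned}$$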
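Formula (a) and the involution property can be checked on a concrete function. The following is a minimal sketch in Python, assuming the sample function $f(x) = \tfrac12 m x^2$ with a hypothetical value of $m$; the helper name `legendre` is ours, and the inverse of the derivative is supplied by hand rather than computed.

```python
import math


def legendre(f, f_prime_inverse):
    """Return g = L(f) via formula (a): g(p) = p*x - f(x) with x = (f')^{-1}(p)."""
    def g(p):
        x = f_prime_inverse(p)
        return p * x - f(x)
    return g


m = 3.0                                  # hypothetical sample value
f = lambda x: 0.5 * m * x * x            # f'(x) = m x
g = legendre(f, lambda p: p / m)         # (f')^{-1}(p) = p / m
h = legendre(g, lambda x: m * x)         # by (b), g'(p) = p / m, so (g')^{-1}(x) = m x

print(g(2.0), 2.0 ** 2 / (2 * m))        # g(p) agrees with p^2 / 2m
print(h(1.5), f(1.5))                    # L(L(f)) agrees with f
```

For this $f$ the transform is $g(p) = p^2/2m$, and applying the same recipe to $g$ returns the original $f$.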


A more revealing argument can be given using an equivalent geometric definition (the original form, essentially due to Fenchel [1]). Note that $g(p)$ can also be described as the maximum vertical distance of the graph of $f$ from the line $y = px$, that is, as the maximum of the function

$$\phi(x) = px - f(x).$$

[Figure: the graph of $f$ and the line $y = px$, with the maximal vertical gap between them marked as having length $g(p)$.]

Naturally this maximum is at the point $x$ where $0 = \phi'(x)$, i.e., $f'(x) = p$, so $g(p) = \phi(x) = px - f(x)$, which is just equation (a$'$).

Now if $h$ is the Legendre transform of $g$, then $h(x)$ is the maximum value of $xp - g(p)$ for all slopes $p$ of tangent lines to $f$. But for the tangent line of $f$ with slope $p$, the $y$-intercept is $-g(p)$, by the first definition of $g$, so the number $-g(p) + xp$ is the $y$-coordinate of the intersection of that tangent line with the vertical line through $(x, 0)$. But the tangent lines all lie below the graph of $f$, so all values of $xp - g(p)$ are $\le f(x)$, and in fact the maximum value, $f(x)$, is obtained for $p = f'(x)$. Q.E.D.

[Figure: a tangent line of slope $p$ with $y$-intercept $-g(p)$, meeting the vertical line through $(x, 0)$ at height $-g(p) + xp$.]
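The maximum description lends itself to a direct numerical experiment. The sketch below (in Python, with the function name `legendre_max`, the sample function $f(x) = x^4/4$, and the search intervals all chosen for illustration) finds the maximum of $px - f(x)$ by ternary search, which is valid because that function is concave when $f$ is convex.

```python
def legendre_max(f, lo, hi, iters=100):
    """Return g with g(p) = max over [lo, hi] of p*x - f(x), for convex f.

    The maximum is located by ternary search on the concave function
    x -> p*x - f(x); this is an illustrative numerical method, not the
    text's definition.
    """
    def g(p):
        a, b = lo, hi
        for _ in range(iters):
            x1 = a + (b - a) / 3
            x2 = b - (b - a) / 3
            if p * x1 - f(x1) < p * x2 - f(x2):
                a = x1          # the maximum lies to the right of x1
            else:
                b = x2          # the maximum lies to the left of x2
        x = 0.5 * (a + b)
        return p * x - f(x)
    return g


f = lambda x: x ** 4 / 4            # convex on [0, 3], with f'(x) = x^3
g = legendre_max(f, 0.0, 3.0)       # closed form: g(p) = (3/4) p^(4/3)
h = legendre_max(g, 0.0, 27.0)      # slopes range over f'([0, 3]) = [0, 27]

print(g(8.0))    # close to 12, attained at x = 2
print(h(2.0))    # close to f(2) = 4, attained at p = f'(2) = 8
```

Applying the same maximization twice recovers $f$ on the interval, in line with the geometric argument above.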


The Cotangent Bundle 515 


The definition of the Legendre transform can be generalized to the case of a function $f\colon U \to \mathbb{R}$ for $U \subset \mathbb{R}^n$. Instead of $f'$ we now have the derivative $Df = (D_1 f, \dots, D_n f)$ where the $D_i$ denote the partial derivatives $\partial/\partial x_i$, and now $Df\colon U \to \mathbb{R}^n$. We want to assume that $Df$ is one-one, with a differentiable inverse, which is always the case if the Jacobian matrix $(\partial^2 f/\partial x_i \partial x_j)$ is everywhere positive definite (Problem 5). The Legendre transform $g$ of $f$ is defined on the set of $p = (p_1, \dots, p_n)$ for which we have $p_i = \partial f/\partial x_i(x)$ for some unique $x \in U$, with $g(p)$ defined as the negative of the $y$-intercept of the tangent $n$-plane to the graph of $f$ at $x$.

The formula analogous to (a) is

$$g(p) = \langle p, (Df)^{-1}(p) \rangle - f\bigl((Df)^{-1}(p)\bigr) = \sum_{i=1}^n p_i \cdot \bigl[(Df)^{-1}(p)\bigr]_i - f\bigl((Df)^{-1}(p)\bigr), \tag{A}$$

probably most easily seen by using the alternate description of $g(p)$ as the maximum vertical distance of the graph of $f$ from the $n$-plane with equation $y = \sum_i p_i x_i$: analogous to (a$'$) we have

$$g(p) = \sum_{i=1}^n p_i x_i - f(x) \quad \text{for the (unique) } x \in U \text{ with } Df(x) = p, \text{ i.e., such that } x \text{ is a critical point of } \sum_{i=1}^n p_i x_i - f(x). \tag{A$'$}$$

From (A) we easily derive the analogue of (b),

$$Dg(p) = (Df)^{-1}(p) \tag{B}$$

and then prove in a straightforward way that $\mathcal{L}$ is involutive.
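Formula (A) can be tried out in two variables. The following Python sketch assumes a quadratic $f(x) = \sum a_{ij} x_i x_j$ with a symmetric positive definite $2 \times 2$ matrix; the function name `legendre_quadratic` and the sample matrix are hypothetical, and $Df$ is inverted by the explicit $2 \times 2$ formula.

```python
import math


def legendre_quadratic(A):
    """Build f, Df and g = L(f) for f(x) = sum a_ij x_i x_j, A symmetric 2x2.

    Here Df(x) = 2 A x, and g is computed from formula (A):
    g(p) = <p, x> - f(x) with x = (Df)^{-1}(p).
    """
    (a, b), (_, d) = A
    det = a * d - b * b

    def f(x):
        return a * x[0] ** 2 + 2 * b * x[0] * x[1] + d * x[1] ** 2

    def Df(x):
        return (2 * (a * x[0] + b * x[1]), 2 * (b * x[0] + d * x[1]))

    def Df_inverse(p):
        # Solve 2 A x = p with the explicit inverse of a 2x2 matrix.
        return ((d * p[0] - b * p[1]) / (2 * det),
                (a * p[1] - b * p[0]) / (2 * det))

    def g(p):
        x = Df_inverse(p)
        return p[0] * x[0] + p[1] * x[1] - f(x)

    return f, Df, g


f, Df, g = legendre_quadratic(((2.0, 1.0), (1.0, 3.0)))
x = (0.5, -1.5)
print(Df(x))             # (-1.0, -8.0)
print(f(x), g(Df(x)))    # both 5.75 for this sample point
```

For this sample point the value of the transform at $Df(x)$ coincides with $f(x)$, which is the behaviour Problem 4 asks the reader to establish for quadratic functions in general.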

The classical definitions of the Legendre transformation were not geometric in nature, and were basically presented as manipulations for writing an expression in one set of variables as a new expression in terms of some other set. To see the use to which Legendre put this transformation, and also deal with some more standard classical notation, suppose we have a function that we write as $f(x, y)$, satisfying the minimal surface equation

$$(1 + f_y^2) f_{xx} - 2 f_x f_y f_{xy} + (1 + f_x^2) f_{yy} = 0.$$

To simplify this nonlinear equation, we want to get rid of the $f_x$ and $f_y$ terms, which is just what the Legendre transform, which we write as $g(\xi, \eta)$, will accomplish, since the condition on $x$ in (A$'$) will now be written as

$$\xi = f_x, \qquad \eta = f_y.$$


516 Chapter 16 


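Since the Legendre transformation is involutive, we also have

$$x = g_\xi, \qquad y = g_\eta.$$

Of course, we are leaving out the arguments for the functions, in the classical manner, trusting that everything will fall into place on its own.

Differentiating $\xi = f_x$ we obtain

$$1 = \frac{\partial f_x}{\partial \xi} = \frac{\partial f_x}{\partial x}\frac{\partial x}{\partial \xi} + \frac{\partial f_x}{\partial y}\frac{\partial y}{\partial \xi} = f_{xx}\, g_{\xi\xi} + f_{xy}\, g_{\eta\xi},$$

together with three other equations. All four can be written together as

$$\begin{pmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{pmatrix} \begin{pmatrix} g_{\xi\xi} & g_{\xi\eta} \\ g_{\eta\xi} & g_{\eta\eta} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

from which we obtain

$$f_{xx} = D \cdot g_{\eta\eta}, \qquad f_{xy} = -D \cdot g_{\eta\xi}, \qquad f_{yy} = D \cdot g_{\xi\xi} \qquad \text{for } D = f_{xx} f_{yy} - f_{xy}^2.$$

When we apply these general equations to the minimal surface equation we immediately obtain

$$(1 + \xi^2)\, g_{\xi\xi} + 2\xi\eta\, g_{\xi\eta} + (1 + \eta^2)\, g_{\eta\eta} = 0,$$

a linear second order equation. Addendum A gives another example, involving first order equations.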

We now want to consider the Legendre transform of a function $f\colon V \to \mathbb{R}$ for a real $n$-dimensional vector space $V$ (or an open subset of $V$). Recall that for each $v \in V$ we have the derivative $Df(v) \in V^*$. Invariantly defined, $Df(v)$ is just the best linear approximation to $f(v + h) - f(v)$ at $h = 0$, i.e., the $\lambda \in V^*$ for which

$$|f(v + h) - f(v) - \lambda(h)| = o(\|h\|), \qquad h \in V,$$

for any norm $\|\ \|$ on $V$. This map $Df$ is exactly the same as the $Df$ on page 515 when we use a basis $v_1, \dots, v_n$ of $V$ to identify $V$ with $\mathbb{R}^n$, and the dual basis $v_1^*, \dots, v_n^*$ to identify $V^*$ with $\mathbb{R}^n$.

If $Df\colon V \to V^*$ is one-one with differentiable inverse under one, and hence any, such pair of identifications, we can define the Legendre transform $g = \mathcal{L}(f)\colon V^* \to \mathbb{R}$ of $f$ by the exact analogue of (a),

$$g(\lambda) = \lambda\bigl((Df)^{-1}(\lambda)\bigr) - f\bigl((Df)^{-1}(\lambda)\bigr).$$


The Cotangent Bundle 517 


For computations, we simply use all the same formulas as before, except that we work with a basis $\{v_i\}$ of $V$ and the dual basis $\{v_i^*\}$ of $V^*$. The Legendre transformation is again involutive: the maps $f\colon V \to \mathbb{R}$ and $\mathcal{L}(\mathcal{L}(f))\colon V^{**} \to \mathbb{R}$ are equal when we identify $V^{**}$ with $V$ by the natural isomorphism.

The fact that we are interested in the Legendre transformation for $f\colon V \to \mathbb{R}$ is, of course, a dead giveaway that we want to apply it to manifolds.


Given a Lagrangian $L\colon TM \to \mathbb{R}$, at each point $p \in M$ the restriction $L_p = L|M_p$ is a smooth function taking the vector space $M_p$ to $\mathbb{R}$. We can therefore consider the derivative

$$D(L_p)\colon M_p \to M_p^*.$$

By putting together all the maps $D(L_p)$, we obtain a map from $TM$ to $T^*M$ called the “fibre derivative” of $L$, though the “fibre-wise derivative” of $L$ might be a better name. For this map we will adopt the notation

$$FDL\colon TM \to T^*M = \bigcup_{p \in M} D(L_p)\colon M_p \to M_p^*.$$

In terms of a coordinate system $(q, \dot q) = (q^1, \dots, q^n, \dot q^1, \dots, \dot q^n)$ for $TM$ and the corresponding coordinate system $(q, p)$ for $T^*M$, the map $FDL$ is given by $(q, \dot q) \mapsto (q, \partial L/\partial \dot q)$. More precisely, we can write

$$p_i \circ FDL = \frac{\partial L}{\partial \dot q^i}.$$

We call $L$ a regular Lagrangian if each $D(L_p)$ is one-one with a differentiable inverse, in which case we can write the above equation for $p_i \circ FDL$ as

$$p_i = \frac{\partial L}{\partial \dot q^i} \circ (FDL)^{-1}.$$

Note that $L$ is always regular in the case of mechanics problems, where $L = T - V$ for the positive definite kinetic energy $T$.

For a regular Lagrangian $L$ we can also apply the Legendre transformation to each $L_p$, to obtain maps $\mathcal{L}(L_p)\colon M_p^* \to \mathbb{R}$, and putting these together we get the “fibre-wise Legendre transform” of $L$,

$$H = F\mathcal{L}(L)\colon T^*M \to \mathbb{R} = \bigcup_{p \in M} \mathcal{L}(L_p)\colon M_p^* \to \mathbb{R}.$$

All these considerations are easily extended to the case where we have a Lagrangian $L\colon TM \times \mathbb{R} \to \mathbb{R}$, obtaining $H\colon T^*M \times \mathbb{R} \to \mathbb{R}$. Basically, we simply work on each $TM \times \{t\}$ separately, and then put the results together.
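To see what the fibre derivative and the fibre-wise Legendre transform look like in the simplest mechanical case, here is a hedged Python sketch for one degree of freedom. The Lagrangian $L = \tfrac12 m \dot q^2 - \tfrac12 k q^2$, the constants `m` and `k`, and the helper names are all assumptions chosen for illustration; the fibre derivative is approximated by a central difference in the velocity variable.

```python
import math


def fibre_derivative(L, q, v, h=1e-6):
    """Approximate p = dL/dv at (q, v) by a central difference along the fibre."""
    return (L(q, v + h) - L(q, v - h)) / (2 * h)


m, k = 2.0, 5.0                                       # hypothetical constants
L = lambda q, v: 0.5 * m * v * v - 0.5 * k * q * q    # L = T - V


def H(q, p):
    """Fibre-wise Legendre transform of L, using formula (a) on the fibre over q."""
    v = p / m                  # inverse of the fibre derivative v -> m v
    return p * v - L(q, v)


print(fibre_derivative(L, 1.0, 3.0))                   # close to m * v = 6
print(H(1.0, 6.0), 6.0 ** 2 / (2 * m) + 0.5 * k)       # both 11.5
```

In this example the fibre derivative is $\dot q \mapsto m \dot q$, and $H$ comes out as $p^2/2m + V$, the familiar total energy.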


518 Chapter 16 


ADDENDUM 16A
THE CLAIRAUT EQUATION


The Clairaut equation provides an interesting elementary application of the
Legendre transformation, and also serves as a simple illustration of the envelope 
of a family of solutions of an equation, which will play an important role in the 
optional sections near the beginning of Chapter 18. In dimension 1, a Clairaut 
equation is one of the form 


$$u(x) = x u'(x) - f(u'(x)).$$

Differentiation leads immediately to

$$0 = [x - f'(u'(x))] \cdot u''(x),$$

and thus “in general” to two different equations:

(1) $u''(x) = 0$
(2) $x - f'(u'(x)) = 0$.

For (1) we must have $u'(x) = c$ for some constant $c$, and indeed

$$u(x) = cx - f(c)$$

is always a solution. We thus have a 1-parameter family of linear solutions (a), and the envelope (b) of this 1-parameter family of solutions must also be a solution.

[Figure: (a) the 1-parameter family of straight line solutions; (b) their envelope.]

The standard way of obtaining the envelope (see Addendum 8B) is to “differentiate the equation $u(x) = cx - f(c)$ with respect to $c$”,

$$0 = x - f'(c) \implies c = (f')^{-1}(x),$$

and then substitute back into the equation $u(x) = cx - f(c)$, to get

$$u(x) = x (f')^{-1}(x) - f\bigl((f')^{-1}(x)\bigr),$$

so the envelope is $\mathcal{L}(f)$, which is also just what we get by solving (2) for $u'(x)$ and substituting back into the equation. This whole discussion assumes that we are


The Clairaut Equation 519 


considering an interval on which we can define $\mathcal{L}(f)$, which is equivalent to the condition that the corresponding straight lines have an envelope. (We can easily obtain more complicated solutions by piecing together an arc of $\mathcal{L}(f)$ with portions of the straight line solutions at either end, but for simplicity we stick to the general case.)
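Both kinds of solution can be checked on a concrete instance. The Python sketch below assumes the sample function $f(c) = \tfrac12 c^2$, for which $\mathcal{L}(f)(x) = \tfrac12 x^2$; the names `line`, `envelope` and `clairaut_residual` are hypothetical helpers, and derivatives are supplied by hand.

```python
f = lambda c: 0.5 * c * c                      # sample f with f'' > 0 and f'(c) = c
line = lambda c: (lambda x: c * x - f(c))      # linear solution u(x) = c x - f(c)
envelope = lambda x: 0.5 * x * x               # L(f)(x) for this particular f


def clairaut_residual(u, du, x):
    """u(x) - [x u'(x) - f(u'(x))]; zero exactly when u solves the equation at x."""
    return u(x) - (x * du(x) - f(du(x)))


c = 2.0
print(clairaut_residual(line(c), lambda x: c, 0.7))     # 0.0 for the straight line
print(clairaut_residual(envelope, lambda x: x, 0.7))    # 0.0 up to rounding for the envelope
print(line(c)(c), envelope(c))                          # the line touches the envelope at x = c
```

Each straight line lies below the parabola and touches it at $x = c$, which is what it means for the parabola to be the envelope of the family.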

Another approach to Clairaut’s equation is to apply the Legendre transformation directly to $u$, to obtain information about its transform $v = \mathcal{L}(u)$. To minimize notational complexity, and maximize confusion, we use the classical notation on page 515. We have

$$\begin{aligned}
v(\xi) &= \xi x - u(x) \qquad \text{with } u_x = \xi \\
&= x u_x - [x u_x - f(u_x)] \\
&= f(\xi),
\end{aligned}$$

i.e., the Legendre transform of the solution is $f$, so by involutivity, the solution is $\mathcal{L}(f)$. In this case, we miss all the straight line solutions, which don’t have Legendre transforms.

In a similar way, for the 2-dimensional Clairaut equation

$$u = x u_x + y u_y - f(u_x, u_y),$$

we find, using the notation on page 515, that $v = \mathcal{L}(u)$ satisfies

$$\begin{aligned}
v(\xi, \eta) &= \xi x + \eta y - u(x, y) \\
&= x u_x + y u_y - [x u_x + y u_y - f(u_x, u_y)] \\
&= f(\xi, \eta),
\end{aligned}$$

so that one solution is $\mathcal{L}(f)$.

The missing solutions are obtained by differentiating the equation, as before, now separately with respect to $x$ and $y$. Letting $f_1 = D_1 f$ and $f_2 = D_2 f$, both evaluated at $(u_x, u_y)$, we obtain

$$\begin{aligned}
(x - f_1)\, u_{xx} + (y - f_2)\, u_{xy} &= 0 \\
(x - f_1)\, u_{xy} + (y - f_2)\, u_{yy} &= 0.
\end{aligned}$$

If $u_{xx} u_{yy} - u_{xy}^2 \ne 0$, then we have $x = f_1$ and $y = f_2$, which gives the solution $\mathcal{L}(f)$ already obtained.

On the other hand, when $u_{xx} u_{yy} - u_{xy}^2$ is identically 0, we get the developable surfaces, or in the general case, the tangent planes to the graph of $\mathcal{L}(f)$.


520 Chapter 16 


PROBLEMS 


1. (a) $\mathcal{L}(cf)(p) = c\,\mathcal{L}f(p/c)$. (Definition (a$'$) helps preserve sanity.)
(b) If $f(x) = \tfrac12 x^2$, then $\mathcal{L}f(p) = \tfrac12 p^2$.
(c) If $f(x) = \tfrac12 m x^2$, then $\mathcal{L}f(p) = p^2/2m$.


2. (a) For any continuous $f$ on an interval $I$, the geometric definition of $\mathcal{L}(f)$ on page 514 makes sense whenever $A = \{(x, y) : x \in I,\ y \ge f(x)\}$ is convex.

[Figure, panels (a) and (b): a convex graph bounding the region $A$, with a supporting line of slope $p_1$.]

Show that a point where the left- and right-hand derivatives of $f$ are unequal gives rise to a line segment in the graph of $\mathcal{L}(f)$.
(b) Conversely, what happens if the graph of $f$ contains a straight line segment?

We can summarize by saying that if $f$ is a convex polygon, then $\mathcal{L}(f)$ is also, with vertices of $f$ corresponding to edges of $\mathcal{L}(f)$ and edges of $f$ corresponding to vertices of $\mathcal{L}(f)$.


3. (a) If $g = \mathcal{L}(f)$, then we have $mx - f(x) \le g(m)$, for all $x$ and $m$, and thus $mx \le f(x) + g(m)$.
(b) If $f(x) = x^a/a$, then $\mathcal{L}(f)(p) = p^b/b$, where $\frac1a + \frac1b = 1$. Hence

$$xm \le \frac{x^a}{a} + \frac{m^b}{b}$$

for all $x, m > 0$ and $\frac1a + \frac1b = 1$ (Young’s inequality).


The Cotangent Bundle 521 
4. If $f$ is a quadratic function,

$$f(x) = \sum_{i,j=1}^n a_{ij} x_i x_j,$$

show that

$$\mathcal{L}(f)\bigl(D_1 f(x), \dots, D_n f(x)\bigr) = f(x).$$


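5. For two points $a, b \in U \subset \mathbb{R}^n$, consider

$$\frac{d}{dt} Df(a + t(b - a)),$$

i.e., the vector

$$\Bigl(\frac{d}{dt} \frac{\partial f}{\partial x_1}(a + t(b - a)),\ \dots,\ \frac{d}{dt} \frac{\partial f}{\partial x_n}(a + t(b - a))\Bigr).$$

Show that

$$\frac{d}{dt} Df(a + t(b - a)) = (b - a) \cdot J(b - a),$$

where $J$ is the Jacobian matrix $(\partial^2 f/\partial x_i \partial x_j)$.


(a) Conclude that

$$Df(b) - Df(a) = \int_0^1 (b - a) \cdot J(b - a),$$

and using the hypothesis that $J$ is always positive definite, show that if $b - a \ne 0$, then $Df(b) - Df(a) \ne 0$.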


CHAPTER 17 


THE INTERPLAY OF 
MECHANICS AND OPTICS 


William Rowan Hamilton (1805-1865) presented his first paper, “Theory of Systems of Rays”, to the Royal Irish Academy in 1824, and on the basis of it and its sequels he was appointed to the Professorship of Astronomy at the University of Dublin in 1827. Three Supplements followed the paper, the whole kit and kaboodle taking up pages 1-293 of Hamilton [1]; luckily, we are not studying optics, so we need only consider one or two things near the beginning and the end of these papers!


Optics emulates mechanics. In the Introduction to his first paper, Hamilton noted that previous investigations of geometrical optics had usually been confined to special cases, with very few general theoretical results. Though Malus’ theorem was of a more general nature, Hamilton noted the error in it, and considered it to be too specialized.

In addition, Hamilton wanted to give optics, still buffeted by the battle between wave and corpuscular theories, as secure a foundation as mechanics, by basing it on a principle that was “independent of any hypothesis about the nature or the velocity of light”. Maupertuis’ principle of least action still held that favored position for mechanics, and Hamilton chose Maupertuis’ principle of least action for light as the basis for optics, avoiding questions of the velocity of light by reference to the refractive index, as mentioned on page 497. We now know that Hamilton’s choice was in fact Fermat’s principle: if we choose units so that the speed of light in a vacuum is 1, the refractive index for any medium is $1/v$ for the speed of light $v$ in that medium, and Fermat’s principle states that the path of a light ray is a critical point for $\int (1/v)\, ds$ along the path, that is, for the time taken to traverse the path.

Note that if $w_1$ and $w_2$ are vectors along the directions of the incident ray and refracted ray whose lengths are $1/v_1$ and $1/v_2$ for the two media, the horizontal components of these vectors have lengths $\sin\alpha/v_1$ and $\sin\beta/v_2$, and thus the





The Interplay of Mechanics and Optics 523


sine law $\sin\alpha/\sin\beta = v_1/v_2$ says these horizontal components are equal, so that $w_2 - w_1$ is perpendicular to the separating plane,


a criterion that can be applied to the tangent plane of a smooth boundary 
surface at any point. 
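The perpendicularity criterion is easy to test numerically. The following Python sketch assumes a horizontal separating plane, angles measured from the vertical normal, and hypothetical sample speeds; the function name `refract` and the representation of $w_1, w_2$ as (horizontal, vertical) pairs are ours.

```python
import math


def refract(alpha, v1, v2):
    """Return the vectors w1, w2 of lengths 1/v1, 1/v2 along the incident and refracted rays.

    alpha is the angle of incidence measured from the normal of a horizontal
    separating plane; each vector is a (horizontal, vertical) pair.
    """
    beta = math.asin(math.sin(alpha) * v2 / v1)     # sine law: sin(alpha)/sin(beta) = v1/v2
    w1 = (math.sin(alpha) / v1, -math.cos(alpha) / v1)
    w2 = (math.sin(beta) / v2, -math.cos(beta) / v2)
    return w1, w2


w1, w2 = refract(0.6, 1.0, 0.75)
print(w2[0] - w1[0])                     # horizontal part of w2 - w1, zero up to rounding
print(math.hypot(*w1), math.hypot(*w2))  # lengths 1/v1 and 1/v2
```

Since the horizontal components agree, the difference $w_2 - w_1$ is purely vertical, that is, perpendicular to the separating plane.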


MALUS’ THEOREM. Suppose now that we have, as in (a) of the figure below, a family of light rays emitted from a surface $\bar\Omega$, passing through a medium with refractive index $1/\bar v$ until they are refracted by a surface $\Omega$ which is the boundary of a medium with refractive index $1/v$. The time required for light to go from a point $(\bar x, \bar y, \bar z)$ on $\bar\Omega$ to a point $(x, y, z)$ is called the optical length of the light ray.

[Figure: (a) rays from $\bar\Omega$ refracted at $\Omega$ and continuing to a point $(x, y, z)$; (b) the associated vector fields.]

Now consider (b) a vector field $\bar{\mathbf{v}}$ on $\bar\Omega$ pointing in the direction of the light rays, with all vectors having length $1/\bar v$, and a vector field $\mathbf{v}$ on $\Omega$ pointing in the direction of the refracted rays, with vectors of length $1/v$. If $a$ is the point where the ray from $\bar a$ intersects $\Omega$, and $\bar l$ is the distance from $\bar a$ to $a$, while $l$ is the distance from $a$ to some point $b$ along the refracted ray, then the optical length $O$ of the ray from $\bar a$ to $b$ is $\bar l/\bar v + l/v$.

For a curve $u \mapsto \bar a(u)$ in the surface $\bar\Omega$ with corresponding $\bar{\mathbf{v}}(u)$, as well as corresponding $a(u)$ and $\mathbf{v}(u)$, and $\bar l(u)$ and $l(u)$, we have

$$a(u) = \bar a(u) + \bar l(u)\, \bar v \cdot \bar{\mathbf{v}}(u), \tag{1}$$
$$b(u) = a(u) + l(u)\, v \cdot \mathbf{v}(u). \tag{2}$$

Since all $\bar{\mathbf{v}}(u)$ have the same length $1/\bar v$, differentiating (1) and taking the inner product with $\bar{\mathbf{v}}(u)$ gives

$$\langle a'(u), \bar{\mathbf{v}}(u) \rangle = \langle \bar a'(u), \bar{\mathbf{v}}(u) \rangle + \bar l'(u)/\bar v.$$

But $a'(u)$ is tangent to $\Omega$, so, using the criterion for the direction of the refracted ray given at the top of this page, we have

$$\langle \bar{\mathbf{v}}(u) - \mathbf{v}(u), a'(u) \rangle = 0,$$

and we obtain

$$\langle a'(u), \mathbf{v}(u) \rangle = \langle \bar a'(u), \bar{\mathbf{v}}(u) \rangle + \bar l'(u)/\bar v. \tag{3}$$


524 Chapter 17


Differentiating (2), taking the inner product with $\mathbf{v}(u)$, and using (3) we obtain

$$\begin{aligned}
\langle b'(u), \mathbf{v}(u) \rangle &= \langle a'(u), \mathbf{v}(u) \rangle + l'(u)/v \\
&= \langle \bar a'(u), \bar{\mathbf{v}}(u) \rangle + \bar l'(u)/\bar v + l'(u)/v \\
&= \langle \bar a'(u), \bar{\mathbf{v}}(u) \rangle + O'(u).
\end{aligned}$$

Suppose, in particular, that all the rays are emitted perpendicularly from $\bar\Omega$, so that all $\langle \bar a'(u), \bar{\mathbf{v}}(u) \rangle = 0$. Then all $\langle b'(u), \mathbf{v}(u) \rangle = 0$ if and only if $O$ is a constant. So, given a system of rays all perpendicular to $\bar\Omega$, we can find a surface perpendicular to the refracted rays simply by taking the endpoints of all rays with any constant optical length. In this way (or at any rate, by equivalent considerations), Hamilton was able to prove the most general version of Malus’ Theorem for refraction, without any involved calculations; reflections can also be included by choosing $v = -\bar v$.

All these considerations can easily be extended to the case of an “optical instrument” where several successive refractions are involved. For a region of the final medium where each point $(x, y, z)$ is on a unique ray starting from a point $(\bar x, \bar y, \bar z)$ on $\bar\Omega$, Hamilton called the optical length of that ray from $(\bar x, \bar y, \bar z)$ to $(x, y, z)$ the characteristic function $V(x, y, z)$ for this instrument, with

$$\Bigl(\frac{\partial V}{\partial x}\Bigr)^2 + \Bigl(\frac{\partial V}{\partial y}\Bigr)^2 + \Bigl(\frac{\partial V}{\partial z}\Bigr)^2 = \frac{1}{v^2}, \tag{C}$$

where $v$ is the speed of light in the final medium, and he made $V$, and related functions, the theoretical basis for optics.


FERMAT’S PRINCIPLE AND HUYGENS’ CONSTRUCTION. Hamilton had mainly been concerned with geometrical optics, but by the time he presented the Third Supplement, in 1832, the wave theory of Young and Fresnel was pretty well established, and Hamilton addressed it near the end.

Given indicatrices, as on page 495, determining the waves obtained by Huygens’ construction, we can introduce Fermat’s principle by identifying our light rays with the geodesics for this method of measuring lengths, restricting our attention to geodesics $\gamma$ for which every tangent vector $\gamma'(t)$ has “length 1”, i.e., lies on the indicatrix at $\gamma(t)$. For light emanating from a single point $p$, we can then define the wave front $\Phi_p(t)$ for $t > 0$ to consist of points $\gamma(t)$ where $\gamma$ defined on $[0, t]$ is a minimal geodesic with $\gamma(0) = p$ (more generally, we could consider light rays emanating from a surface, or any closed set).

When our indicatrices are spheres of varying size so that we are dealing with a Riemannian metric $\langle\ ,\ \rangle$ for which orthogonality is the same as the

The Interplay of Mechanics and Optics 525


Euclidean one, each such geodesic $\gamma$ from $p$ to a point $q$ in the wave front $W$ will intersect $W$ orthogonally. This can be proved by a calculation, basically the one used to prove “Gauss’ Lemma” (cf. DG, pg. 337), but a geometric limiting argument for this intuitively obvious fact can be given by considering the extension of $\gamma$ to a point $q'$ beyond $q$ by a very small amount (a). Then (b) the line from $q$ to $q'$ will be practically the direction of the tangent vector $v$ of $\gamma$ at $q$, and it should also be almost orthogonal to the tangent plane of $W$ at $q$, since the shortest distance from the point $q'$ to that plane is the perpendicular.

[Figure: (a) the geodesic from $p$ meeting the wave front $W$ at $q$ and extended slightly to $q'$; (b) the tangent plane of $W$ at $q$.]

For general indicatrices, of which ellipsoids are a special case, a similar argument shows that if $I_q$ is the indicatrix at $q$, then the tangent vector $v$ of $\gamma$ at $q$ is the unique vector $v$ in $I_q$ conjugate to the tangent plane of $W$ at $q$, meaning that the tangent plane $P$ of $I_q$ at $v$ is parallel to the tangent plane of $W$ at $q$, so that $v$ is the point of the indicatrix furthest from the tangent plane of $W$ at $q$.

[Figure: the indicatrix $I_q$, the vector $v$, and the tangent plane of $W$ at $q$.]

Knowing a specific relationship, it is easy to show that for $s > 0$ we have Huygens’ construction,

$$\Phi_p(t + s) \subset \text{envelope of } \{\Phi_{\bar p}(s) : \bar p \in \Phi_p(t)\}.$$

[Figure: the wave front $\Phi_p(t + s)$ as the envelope of the fronts $\Phi_{\bar p}(s)$.]

In fact, for $q \in \Phi_p(t + s)$, let $\gamma$ be a geodesic defined on $[0, t + s]$ with $\gamma(0) = p$ and $\gamma(t + s) = q$, and let $v$ be the tangent vector at $q$, so that $v$ is conjugate to the tangent plane of $\Phi_p(t + s)$ at $q$. Then $\bar p = \gamma(t) \in \Phi_p(t)$ and since $\bar\gamma = \gamma|[t, t + s]$ is a geodesic of length $s$ from $\bar p$ to $q$, we have $q \in \Phi_{\bar p}(s)$. But $v$ is also the tangent vector of $\bar\gamma$ at $q$, so it must also be conjugate to the tangent plane of $\Phi_{\bar p}(s)$ at $q$. It follows that $\Phi_{\bar p}(s)$ and $\Phi_p(t + s)$ are tangent at $q$.

In this way, Hamilton linked Fermat’s principle to Huygens’ construction 
for the spread of wavefronts, and conversely, given Huygens’ construction, the 


526 Chapter 17


integral curves of the conjugate directions give the rays for Fermat’s principle, 
formalizing the notion, by then generally recognized, that the two principles 
were essentially equivalent. 


Note, by the way, that the wave fronts are just the level sets of $V$; since these level sets are far apart when the wave moves fast and close together when it moves slowly, Hamilton called the gradient

$$p = \operatorname{grad} V = \Bigl(\frac{\partial V}{\partial x}, \frac{\partial V}{\partial y}, \frac{\partial V}{\partial z}\Bigr),$$

which can also be defined by

$$dV = \sum_{i=1}^3 p_i\, dx^i,$$

the vector of normal slowness of the wave front.

[Figure: level sets $V = 1, 2, \dots, 6$, widely spaced where the wave is fast and closely spaced where it is slow.]

The vector $p$ is perpendicular to the wave front in the usual inner product on $\mathbb{R}^3$, so if the indicatrices are all spherical, and thus perpendicularity isn’t changed, we have, using (C) on page 524, that the tangent $t$ of a ray is given by the equation $t = p/|p| = vp$.


CONICAL REFRACTION IN ARAGONITE. The third Supplement ended with a demonstration that Hamilton had also mastered the intricacies of Fresnel’s theory. He proved that for “bi-axial” crystals, with properties like Iceland spar, when a ray hits the crystal at a certain angle (a) the ray will not split into two rays, but theoretically split instead into a narrow cone of rays, then emerging as a narrow hollow cylinder of rays (internal conical refraction), while at another angle (b) a ray of light will not be split at all until it emerges from the crystal in a narrow cone (external conical refraction).

[Figure: (a) internal conical refraction; (b) external conical refraction.]

Hamilton asked the physicist Humphrey Lloyd of the University of Dublin to 
test this prediction. After obtaining an extremely pure specimen of the crystal 
aragonite, Lloyd was able to check these results, with data agreeing extremely 
well with Hamilton’s predictions. Fresnel’s theory had by then received so much 
attention that no one expected any startling new theoretical deductions to be 
made, and the discovery of conical refraction suddenly made Hamilton quite 
famous, and something of a scientific hero in Ireland. 


But Hamilton’s lasting fame today rests on a paper published shortly after the third Supplement had been written.


The Interplay of Mechanics and Optics 527


Mechanics returns the compliment. Hamilton’s main papers on Optics appeared in the Transactions of the Royal Irish Academy, but in 1834 two papers by Hamilton on the subject of Dynamics were published in the Philosophical Transactions of the Royal Society of England (perhaps his conical refraction fame had a role in this). These papers eventually turned out to be as important as Hamilton obviously expected. In fact, in ways that were unexpected, and in ways that no one at the time could have anticipated, they turned out to be more important than he could ever have imagined.

As we saw in previous sections, in Hamilton’s treatment of optics, the rays were determined by using a variational principle, Maupertuis’ principle of least action for light, which we discreetly replaced by Fermat’s principle, involving $\int (1/v)\, ds$. Then the “optical length” of a ray, the value of $\int (1/v)\, ds$ along the ray, was used to define a characteristic function.

Hamilton aimed to transfer this treatment of optics back to mechanics, and 
in his first paper he naturally chose Maupertuis’ principle of least action to 
determine the trajectories of a set of particles, and used the value of the action 
integral A along this trajectory to define a characteristic function for mechanics. 

Not surprisingly, in view of the vagaries of Maupertuis’ form of the principle of least action, the formulas Hamilton derived had some unpleasant complexities, and right at the end of the paper Hamilton noted that it would be simpler if, instead of the integral of the action, one used the integral $\int L = \int (T - V)$ to get an “auxiliary function $S$”.

By the time Hamilton wrote up his second paper, $S$ had been promoted to the head of the class, and was now called the principal function. So in this paper Hamilton started all over. In particular, the trajectories $c$ of a system of particles were now determined by the principle that

$$c\colon [t_1, t_2] \to M \quad \text{should be a critical point for} \quad \int_{t_1}^{t_2} L(c'(t), t)\, dt.$$

And, as in one of Kipling’s Just So Stories, that is how “a simple and general result of the laws of mechanics” noted by Lagrange (page 464) ended up being known as Hamilton’s principle.

For the characteristic function in optics, we had to consider a region where 
each point is on a unique ray starting from some point on our initial surface Q, 
or in the simplest case a unique ray starting from some fixed point; this is
basically the same as saying that we need each point in the region to be on 
a ray with a unique initial direction. The principal function S for mechanics 
requires more involved considerations, because mechanics trajectories can have 
different speeds, so we need to specify more than just an initial direction. As we 
will see, Hamilton was not one to shy away from such involved considerations. 


528 Chapter 17


Suppose $p_*$ is some fixed point in $M$, and we have a particular trajectory $c_0$ from $p_*$, defined on an interval $[0, t_0]$, with endpoint $p_0 \in M$. Rather than simply considering a neighborhood of $p_0$ in $M$, we will consider a neighborhood of $(p_0, t_0)$ in $M \times \mathbb{R}$: if $(p, t)$ is sufficiently close to $(p_0, t_0)$, there will be a unique trajectory $c_{(p,t)}\colon [0, t] \to M$, defined on the interval $[0, t]$, with $c_{(p,t)}(0) = p_*$ and $c_{(p,t)}(t) = p$. We then define $S\colon M \times \mathbb{R} \to \mathbb{R}$ by

$$S(p, t) = \int_0^t L\bigl(c_{(p,t)}'(\tau), \tau\bigr)\, d\tau.$$

[Figure: the trajectory $c_0$ from $p_*$ to $p_0 = c_0(t_0)$ and a nearby trajectory $c_{(p,t)}$ ending at $p = c_{(p,t)}(t)$.]


The equations on $T^*M$. Hamilton’s first step in studying the principal function $S$ was to transfer Lagrange’s equations on $TM$ to equations on $T^*M$: a curve $c\colon \mathbb{R} \to M$ gives rise to the curve $c'\colon \mathbb{R} \to TM$, and thus to a curve

$$\gamma = FDL \circ c'\colon \mathbb{R} \to T^*M, \qquad \gamma(t) = D(L_{c(t)})(c'(t)),$$

and we want to translate Lagrange’s equations for $c$ into equations for $\gamma$, which turn out to assume a very simple form in terms of the fibre-wise Legendre transform $H$ of $L$ (Hamilton never actually mentions the Legendre transformation, merely introducing its formulas with hardly any explanation). We follow the classical computations, with a bit more explicitness.

A coordinate system $q$ on $M$ gives coordinates $(q, \dot q)$ on $TM$ and $(q, p)$ on $T^*M$. So that we don’t get tangled up in cumbersome notation, we will let the inverse of $FDL\colon TM \to T^*M$ be denoted simply by $\phi\colon T^*M \to TM$. Then the next to last equation on page 517 can be written

$$p_i = \frac{\partial L}{\partial \dot q^i} \circ \phi. \tag{1}$$

For $H$, we want to apply equation (A) on page 515 for the $DL$ on the various $M_p$, so the coordinates that we apply to the $(Df)^{-1}$ term are now the $\dot q^i$, and thus we define

$$H = \sum_{i=1}^n p_i \cdot (\dot q^i \circ \phi) - L \circ \phi. \tag{2}$$

(It’s a good idea to consult Problem 1 for examples of what $H$ looks like.) Then

$$dH = \sum_{i=1}^n p_i\, d(\dot q^i \circ \phi) + \sum_{i=1}^n (\dot q^i \circ \phi)\, dp_i - \sum_{i=1}^n \Bigl(\frac{\partial L}{\partial q^i} \circ \phi\Bigr) d(q^i \circ \phi) - \sum_{i=1}^n \Bigl(\frac{\partial L}{\partial \dot q^i} \circ \phi\Bigr) d(\dot q^i \circ \phi).$$

Equation (1) shows that the first and fourth terms cancel, so that we have

$$dH = \sum_{i=1}^n (\dot q^i \circ \phi)\, dp_i - \sum_{i=1}^n \Bigl(\frac{\partial L}{\partial q^i} \circ \phi\Bigr) d(q^i \circ \phi).$$

The Interplay of Mechanics and Optics 529







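
But we always have $dH = \sum_{i=1}^n \dfrac{\partial H}{\partial p_i}\, dp_i + \dfrac{\partial H}{\partial q^i}\, d(q^i \circ \phi)$, so we get

$$\dot q^i \circ \phi = \frac{\partial H}{\partial p_i}, \qquad \frac{\partial L}{\partial q^i} \circ \phi = -\frac{\partial H}{\partial q^i}.$$

[For more classical looking computations, simply delete ‘$\circ\, \phi$’ wherever it occurs.]


Now suppose that $c$ satisfies Lagrange’s equations,

$$\frac{d}{dt}\Bigl(\frac{\partial L}{\partial \dot q^i}(c'(t))\Bigr) = \frac{\partial L}{\partial q^i}(c'(t)).$$

Taking into account equation (1) again, we find that $\gamma$ satisfies

$$(q^i \circ \gamma)'(t) = \frac{\partial H}{\partial p_i}(\gamma(t)), \qquad (p_i \circ \gamma)'(t) = -\frac{\partial H}{\partial q^i}(\gamma(t)), \tag{H}$$

or, in “condensed notation”,

$$\dot q^i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q^i},$$

a system of $2n$ first order equations, in contrast to Lagrange’s system of $n$ second order differential equations.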
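As a quick sanity check of equations (H), one can take a trajectory that is known in closed form and verify the two first order equations numerically. The Python sketch below assumes the harmonic oscillator Lagrangian $L = \tfrac12 m \dot q^2 - \tfrac12 k q^2$ with hypothetical constants; derivatives along the curve are approximated by central differences, and the helper name `ddt` is ours.

```python
import math

m, k = 2.0, 8.0                       # hypothetical constants
omega0 = math.sqrt(k / m)             # angular frequency, here 2

H = lambda q, p: p * p / (2 * m) + 0.5 * k * q * q
dH_dq = lambda q, p: k * q
dH_dp = lambda q, p: p / m

# A solution of Lagrange's equations, lifted to T*M by p = m * qdot.
q = lambda t: math.cos(omega0 * t)
p = lambda t: -m * omega0 * math.sin(omega0 * t)


def ddt(fn, t, h=1e-5):
    """Central difference approximation to the derivative of fn at t."""
    return (fn(t + h) - fn(t - h)) / (2 * h)


for t in (0.0, 0.4, 1.3):
    print(ddt(q, t) - dH_dp(q(t), p(t)),     # q' - dH/dp, near zero
          ddt(p, t) + dH_dq(q(t), p(t)),     # p' + dH/dq, near zero
          H(q(t), p(t)))                     # stays at 4 along the curve
```

The constancy of $H$ along the curve is the energy integral that follows from (H) when $L$ does not depend on $t$.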


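For a Lagrangian $L\colon TM \times \mathbb{R} \to \mathbb{R}$ depending on $t$, we just apply this analysis to each $TM \times \{t\}$ to get equations on $T^*M \times \mathbb{R}$. It is not hard to see that in this case we also have

$$\frac{\partial H}{\partial t}(\gamma(t), t) = -\frac{\partial L}{\partial t}(c'(t), t), \quad \text{condensed to} \quad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}. \tag{H$'$}$$

Equation (H) implies that, in slightly abbreviated notation,

$$\frac{d}{dt} H(\gamma(t), t) = \sum_{i=1}^n \Bigl[\frac{\partial H}{\partial q^i}\, (q^i)' + \frac{\partial H}{\partial p_i}\, (p_i)'\Bigr] + \frac{\partial H}{\partial t}(\gamma(t), t) = \frac{\partial H}{\partial t}(\gamma(t), t),$$

so when $L$ (and $H$) do not depend on $t$, we have the generalized energy integral

$$H(\gamma(t)) \text{ is a constant.} \tag{E}$$

More generally, equations (1) and (2) show that

$$H(\gamma(t), t) = \Bigl[\sum_{i=1}^n \frac{\partial L}{\partial \dot q^i}\, \dot q^i - L\Bigr](c'(t), t).$$

The term in brackets is just the action minus $L$ on page 448, which is a constant if $L$ and $H$ do not depend on $t$; usually, of course, it is the energy $E = T + V$.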





530 Chapter 17 


The partial derivatives of $S$. For Hamilton, the equations (H) were mainly a tool for finding the partial derivatives of the principal function $S$. For an initial point $p_* \in M$, the principal function $S$ may be written in coordinates as

$$S(q, t) = S(q^1, \dots, q^n, t) = \int_0^t L\bigl(c_{(q,t)}'(\tau)\bigr)\, d\tau$$

where $c_{(q,t)}\colon [0, t] \to M$ is the solution of Lagrange’s equations satisfying $c_{(q,t)}(0) = p_*$ and $c_{(q,t)}(t) =$ the point with coordinates $q$. For notational convenience, we use the abbreviation $\gamma_q$ for the curve

$$\gamma_q = \gamma_{(q,t)} = FDL \circ c_{(q,t)}' \quad \text{in } T^*M.$$

In Hamilton’s calculations of the partial derivatives of $S$, the Lagrangian $L$ was replaced, without any preamble, by the Legendre transform of $H$, thus relying on the fact that the Legendre transformation is involutive, so that we have (after a bit of thought)

$$S(q, t) = \int_0^t \Bigl[\sum_{j=1}^n (p_j \circ \gamma_q)(\tau) \cdot (q^j \circ \gamma_q)'(\tau) - H(\gamma_q(\tau))\Bigr] d\tau. \tag{3}$$
j=l 
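To find $\partial S/\partial q^i$, we take the derivative $\partial/\partial q^i$ inside the integral sign to obtain

$$\begin{aligned}
\frac{\partial S}{\partial q^i}(q, t) = \int_0^t \Bigl[\ &\sum_{j=1}^n \frac{\partial}{\partial q^i}(p_j \circ \gamma_q)(\tau) \cdot (q^j \circ \gamma_q)'(\tau) + \sum_{j=1}^n (p_j \circ \gamma_q)(\tau) \cdot \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)'(\tau) \\
&- \sum_{j=1}^n \frac{\partial H}{\partial q^j}(\gamma_q(\tau))\, \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)(\tau) - \sum_{j=1}^n \frac{\partial H}{\partial p_j}(\gamma_q(\tau))\, \frac{\partial}{\partial q^i}(p_j \circ \gamma_q)(\tau)\ \Bigr] d\tau.
\end{aligned}$$

Applying equations (H), we see that the first and fourth sums cancel, and we can also write the $\partial H/\partial q^j$ in the third sum in terms of $p_j$, to get

$$\begin{aligned}
\frac{\partial S}{\partial q^i}(q, t) &= \int_0^t \sum_{j=1}^n \Bigl[(p_j \circ \gamma_q)(\tau) \cdot \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)'(\tau) + (p_j \circ \gamma_q)'(\tau) \cdot \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)(\tau)\Bigr] d\tau \\
&= \int_0^t \frac{d}{d\tau}\Bigl(\sum_{j=1}^n (p_j \circ \gamma_q)(\tau) \cdot \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)(\tau)\Bigr) d\tau \\
&= \sum_{j=1}^n (p_j \circ \gamma_q)(\tau) \cdot \frac{\partial}{\partial q^i}(q^j \circ \gamma_q)(\tau)\ \Big|_{\tau = 0}^{\tau = t}.
\end{aligned}$$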


The Interplay of Mechanics and Optics 531 


Since $c_{(q,t)}(t)$ has coordinates $q$, the same is true of $\gamma_q(t)$, so that

$$\frac{\partial}{\partial q^i}(q^j \circ \gamma_q)(t) = \delta^j_i,$$

and the upper limit term, for $\tau = t$, is just $p_i(\gamma_q(t))$. Moreover, all $\gamma_q(0) = p_*$, so that the lower limit term, for $\tau = 0$, vanishes. Thus, we have found that

$$\frac{\partial S}{\partial q^i}(q, t) = p_i(\gamma_q(t)). \tag{4}$$

For the computationally challenged, a startlingly short alternative proof is given in Problem 2.

These results mean that if we have a formula for $S$, then we know the $p_i(\gamma_q(t))$, which in the case of mechanics problems with $L = T - V$ means that we know the $\dot q^i(c'(t))$, and thus we can solve for $c$ in terms of integrals. “The difficulty of mathematical dynamics is therefore reduced to the search and study of this one function $S$, which may for that reason be called the PRINCIPAL FUNCTION of motion of a system.”


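A partial differential equation for S. For the study of S, we will first find a
formula for $\partial S/\partial t$. Note that equation (3) immediately gives, for the derivative
of S along the fixed curve $c_{(q,t)}$,

$$\frac{dS}{dt} = \sum_{j=1}^n (p_j\circ\gamma_q)(t)\,(q^j\circ\gamma_q)'(t) - H(\gamma_q(t)).$$

But also

$$\frac{dS}{dt} = \frac{\partial S}{\partial t}(q,t) + \sum_{j=1}^n \frac{\partial S}{\partial q^j}(q,t)\,(q^j\circ\gamma_q)'(t).$$

Therefore

$$\begin{aligned}
\frac{\partial S}{\partial t}(q,t)
 &= \sum_{j=1}^n (p_j\circ\gamma_q)(t)\,(q^j\circ\gamma_q)'(t) - H(\gamma_q(t))
    - \sum_{j=1}^n \frac{\partial S}{\partial q^j}(q,t)\,(q^j\circ\gamma_q)'(t) \\
 &= -H(\gamma_q(t)) \qquad\text{using (4).}
\end{aligned} \tag{5}$$

Together, (4) and (5) give us the equations

$$\begin{aligned}
\frac{\partial S}{\partial q^i}(q,t) &= p_i(\gamma_q(t)), &\qquad\text{condensed:}\quad \frac{\partial S}{\partial q^i}(q,t) &= p_i, \\
\frac{\partial S}{\partial t}(q,t) &= -H(\gamma_q(t),t), &\qquad\text{condensed:}\quad \frac{\partial S}{\partial t}(q,t) &= -H(q,p,t).
\end{aligned} \tag{6}$$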


532 Chapter 17 


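These equations can be combined to give a single first order partial differential
equation for the function S on M × R,

$$\frac{\partial S}{\partial t} + H\Big(q^1,\dots,q^n,\frac{\partial S}{\partial q^1},\dots,\frac{\partial S}{\partial q^n},t\Big) = 0.$$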
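
As a sanity check on (4), (5) and the resulting partial differential equation, here is a minimal sketch for the simplest case we can think of. The example is an assumption of ours, not taken from Hamilton: a free particle of mass m in one dimension starting at q = 0, so that the solution reaching q at time t is a straight line and S(q,t) = mq²/2t. The function names are hypothetical.

```python
# Finite-difference check of dS/dq = p and dS/dt = -H for a free particle.
# Assumptions (not from the text): one dimension, mass m, initial point q = 0,
# so the solution reaching q at time t is the straight line c(tau) = q*tau/t.

def S(q, t, m=1.0):
    # Principal function: integral of L = m*v**2/2 along the straight line,
    # whose constant velocity is v = q/t.
    return m * q * q / (2.0 * t)

def H(p, m=1.0):
    # Free-particle Hamiltonian.
    return p * p / (2.0 * m)

def check(q, t, m=1.0, h=1e-6):
    # Central differences for the two partial derivatives of S.
    dS_dq = (S(q + h, t, m) - S(q - h, t, m)) / (2 * h)
    dS_dt = (S(q, t + h, m) - S(q, t - h, m)) / (2 * h)
    p = m * q / t  # momentum at the endpoint of the straight-line solution
    return dS_dq - p, dS_dt + H(p, m)

err_q, err_t = check(q=1.5, t=2.0, m=3.0)
print(err_q, err_t)  # both should be close to zero
```

Both printed residuals should be small, which is just the statement that this S satisfies the equation above.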


In Hamilton’s paper, this equation is arrived at in short order, and the remaining
forty or so pages are devoted to explaining how to get increasingly better
approximations to the solution S of the equation, and thus to the equations of 
the system for which S was constructed. A prototypical use of such a process 
would be to obtain approximate solutions to the three body problem, starting 
with the solution of the two body problem, especially if the third body were 
small, like a moon. 

On the other hand, while Hamilton treated the equations (H) simply as a 
means to obtain the equation for S, these equations became a central object 
of study in the work of Jacobi, to be discussed in the next chapter, imbuing 
Hamilton’s results with an entirely different flavor, which forms the basis for 
what we now call “Hamiltonian mechanics”. In fact, nowadays equations (H) 
are usually presented as the basis of Hamiltonian mechanics, with the princi- 
pal function S usually entering the picture only as a sort of afterthought. So, 
although Hamilton’s viewpoint was the inspiration for a whole new way of look- 
ing at mechanics, in the following chapters we will also want to focus attention 
on the equations (H) themselves. 


For the present, however, we simply want to assess our current situation, and
prepare for the future.


Invariant definitions; the interplay of TM and T*M. So far we have only given
somewhat indirect proofs of the invariance of Lagrange’s equation. Given a
Lagrangian L: TM → R, we have not defined a vector field $X_L$ on TM such
that the (first order) equations for the integral curves of $X_L$ will be equivalent
to Lagrange’s (second order) equations. This calls for a few words about second
order ODE’s on M.

For a coordinate system q on M, and corresponding $(q,\dot q)$ on TM, consider
the second order system of ODE’s

$$\frac{d^2 q^i}{dt^2} = F^i\Big(q,\frac{dq}{dt}\Big) \tag{a}$$

with the arguments omitted, as usual, and the case of explicit dependence on t
omitted for simplicity. We can write this as a system of 2n first order ODE’s,

$$\frac{dq^i}{dt} = v^i, \qquad \frac{dv^i}{dt} = F^i(q,v), \tag{b}$$





The Interplay of Mechanics and Optics 533


which can be thought of as a first order ODE for a vector field X on TM,

$$X(v) = \sum_{i=1}^n v^i \frac{\partial}{\partial q^i}\bigg|_v + \sum_{i=1}^n F^i(q,v)\,\frac{\partial}{\partial \dot q^i}\bigg|_v \qquad\text{for all } v \in M_q. \tag{c}$$

We clearly have

(II) $\pi_*(X(v)) = v$ for all $v \in TM$

and we call any X on TM satisfying (II) a second order ODE on M, since it is
then of the form (c), equivalent to (b), and thus to (a).
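
The passage from (a) to (b) is exactly what one does to integrate a second order equation numerically. The following is a minimal sketch under our own assumptions: the right-hand side F is a hypothetical choice (a unit-frequency harmonic oscillator), and classical Runge-Kutta is used only because it is simple, not because the text prescribes it.

```python
# Rewrite q'' = F(q, q') as the first order system q' = v, v' = F(q, v)
# and integrate it with classical Runge-Kutta (RK4).
import math

def F(q, v):
    # Hypothetical right-hand side: unit-frequency harmonic oscillator.
    return -q

def X(state):
    # Vector field on TM in the form (c): the q-component is v itself.
    q, v = state
    return (v, F(q, v))

def rk4_step(state, h):
    def shifted(s, k, c):
        return (s[0] + c * k[0], s[1] + c * k[1])
    k1 = X(state)
    k2 = X(shifted(state, k1, h / 2))
    k3 = X(shifted(state, k2, h / 2))
    k4 = X(shifted(state, k3, h))
    return (state[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            state[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def integrate(state, t_end, n):
    h = t_end / n
    for _ in range(n):
        state = rk4_step(state, h)
    return state

# Start at q = 1, v = 0; the exact solution is q = cos t, v = -sin t.
q, v = integrate((1.0, 0.0), t_end=1.0, n=1000)
print(q - math.cos(1.0), v + math.sin(1.0))
```

Note that the first component of X is forced to be v; this is condition (II) in coordinates.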

The question then becomes: can we define, directly in terms of L, a vector
field $X_L$ on TM, which is a second order ODE on M giving Lagrange’s equations
for L? In general, we cannot, because Lagrange’s equations are not in
the standard form (a), with second derivatives given explicitly. It is only in the
case of a regular Lagrangian that they can be written in the form (a), so we only
expect to find $X_L$ for a regular Lagrangian L.

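Taking this as a hint, we shift our attention to the equations on T*M into
which Lagrange’s equations are taken by FDL,

$$\dot q^i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q^i}. \tag{H}$$

These equations can simply be regarded as a set of first order differential equations
for a vector field on T*M,

$$X_H = \sum_{i=1}^n \Big[\Big(\frac{\partial H}{\partial p_i}\Big)\frac{\partial}{\partial q^i} - \Big(\frac{\partial H}{\partial q^i}\Big)\frac{\partial}{\partial p_i}\Big].$$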


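Everything has still been written in terms of a coordinate system, but now we
can bring in the invariantly defined 2-form $\omega = \sum_i dp_i \wedge dq^i$ on T*M. We
easily check that

$$\omega\Big(X_H,\ \sum_{j=1}^n a^j\frac{\partial}{\partial q^j} + \sum_{j=1}^n b_j\frac{\partial}{\partial p_j}\Big)
 = \sum_{j=1}^n\Big(-\frac{\partial H}{\partial q^j}\,a^j - \frac{\partial H}{\partial p_j}\,b_j\Big)
 = -dH\Big(\sum_{j=1}^n a^j\frac{\partial}{\partial q^j} + \sum_{j=1}^n b_j\frac{\partial}{\partial p_j}\Big),$$

or, using the notation introduced on page 512,

$$X_H \,\lrcorner\, \omega = -dH.$$

Since $X \mapsto X \,\lrcorner\, \omega$ is one-one, this determines $X_H$ uniquely, giving an invariant
definition of the vector field $X_H$. This makes sense for any H: T*M → R,
and henceforth a “Hamiltonian H on T*M” just means any smooth function
H: T*M → R (with the suggestion that we will be interested in $X_H$).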
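
The identity relating $X_H$, ω and dH can be checked numerically at a single point. The sketch below assumes n = 1 and uses a pendulum Hamiltonian chosen by us purely for illustration; tangent vectors are stored as pairs (dq-part, dp-part), and all function names are hypothetical.

```python
# Check omega(X_H, Y) = -dH(Y) at a point of T*R, with omega = dp ^ dq.
# The Hamiltonian below (a pendulum) is an arbitrary choice for illustration.
import math

def H(q, p):
    return 0.5 * p * p - math.cos(q)

def dH(q, p, h=1e-6):
    # Partial derivatives (dH/dq, dH/dp) by central differences.
    return ((H(q + h, p) - H(q - h, p)) / (2 * h),
            (H(q, p + h) - H(q, p - h)) / (2 * h))

def X_H(q, p):
    # Components (dq-part, dp-part) of the Hamiltonian vector field.
    Hq, Hp = dH(q, p)
    return (Hp, -Hq)

def omega(X, Y):
    # (dp ^ dq)(X, Y) = dp(X) dq(Y) - dp(Y) dq(X).
    return X[1] * Y[0] - Y[1] * X[0]

q, p = 0.7, -0.3
Y = (1.3, 2.1)                    # arbitrary tangent vector a d/dq + b d/dp
Hq, Hp = dH(q, p)
lhs = omega(X_H(q, p), Y)
rhs = -(Hq * Y[0] + Hp * Y[1])    # -dH(Y)
print(lhs, rhs)
```

The two printed numbers agree for every choice of Y, which is the coordinate form of the invariant definition.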


534 Chapter 17 


Aside from the elegance of an invariant definition on T*M, we emphasize
that on TM we are constrained to consider second order ODE’s, which means
we can only allow curves of the form $(q(t), \dot q(t))$; more general curves of the
form $(Q^1(t),\dots,Q^{2n}(t))$ for an arbitrary coordinate system $(Q^1,\dots,Q^{2n})$ on
the 2n-dimensional manifold TM never enter the picture. On the other hand,
on T*M, where there is no particular a priori relationship between the p and q
coordinates, the equations of a trajectory may involve more general sets of
coordinates $(Q^1,\dots,Q^n,P_1,\dots,P_n)$, with significant consequences, as we shall
see in Chapter 19.


The extended Hamilton’s principle. A result that will play an important role in 
Chapter 19 gives an interpretation of the equations (H) as a condition for ex- 
tremals. Since equations (H) are basically the Legendre transform of Lagrange’s 
equations, which are equivalent to the Euler equations for the Lagrangian, it 
is not unreasonable to expect that they can be derived directly from the Euler 
equations for the Hamiltonian H = F£(L). However, there is a bit of a surprise 
in store. 

For two fixed points $p_0, p_1 \in M$, consider a curve γ in T*M, defined on the
interval $[t_0,t_1]$, for which $\pi(\gamma(t_0)) = p_0$ and $\pi(\gamma(t_1)) = p_1$, where π is the
projection π: T*M → M. In other words, the “base curve” c = π ∘ γ of γ
goes from $p_0$ to $p_1$. Consider the integral

$$\begin{aligned}
J &= \int_{t_0}^{t_1} \Big[\sum_{j=1}^n p_j\,dq^j - H\Big](\gamma'(t))\,dt \\
  &= \int_{t_0}^{t_1} \Big[\sum_{j=1}^n p_j(\gamma(t))\,(q^j\circ\gamma)'(t) - H(\gamma(t),t)\Big]\,dt.
\end{aligned}$$

Since L is the Legendre transform of H, this is essentially the same as

$$\int_{t_0}^{t_1} L(c'(t),t)\,dt,$$

so we might expect that the critical values of J satisfy the equations (H). Note
that this is not directly equivalent to Hamilton’s principle, since we are now
considering arbitrary curves γ in T*M, not only those of the form c′(t); in other
words, as it is usually expressed, we are allowing p and q to vary independently.


PROPOSITION (EXTENDED HAMILTON’S PRINCIPLE). The critical
values of J are precisely the curves $\gamma\colon [t_0,t_1] \to T^*M$ whose base curves
c = π ∘ γ go from $p_0$ to $p_1$ and which satisfy the equations (H).


The Interplay of Mechanics and Optics 535 


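PROOF. Consider a variation α(u,t) of c, and let

$$J(u) = \int_{t_0}^{t_1} \Big[\sum_{j=1}^n p_j\,dq^j - H\Big]\Big(\frac{\partial\alpha}{\partial t}(u,t)\Big)\,dt.$$

We leave it to the reader to check that the following computations make sense
when the proper arguments of functions are inserted. We start with

$$\frac{dJ}{du} = \int_{t_0}^{t_1} \sum_{j=1}^n \Big[\frac{\partial p_j}{\partial u}\,\dot q^j + p_j\,\frac{\partial \dot q^j}{\partial u}
 - \frac{\partial H}{\partial q^j}\frac{\partial q^j}{\partial u} - \frac{\partial H}{\partial p_j}\frac{\partial p_j}{\partial u}\Big]\,dt.$$

Now $\partial \dot q^j/\partial u = d/dt\,(\partial q^j/\partial u)$, and using integration by parts we have

$$\begin{aligned}
\frac{dJ}{du} &= \int_{t_0}^{t_1} \sum_{j=1}^n \Big[\frac{\partial p_j}{\partial u}\,\dot q^j - \dot p_j\,\frac{\partial q^j}{\partial u}
 - \frac{\partial H}{\partial q^j}\frac{\partial q^j}{\partial u} - \frac{\partial H}{\partial p_j}\frac{\partial p_j}{\partial u}\Big]\,dt \\
 &= \int_{t_0}^{t_1} \sum_{j=1}^n \Big[\frac{\partial p_j}{\partial u}\Big(\dot q^j - \frac{\partial H}{\partial p_j}\Big)
 - \frac{\partial q^j}{\partial u}\Big(\dot p_j + \frac{\partial H}{\partial q^j}\Big)\Big]\,dt.
\end{aligned}$$

It follows that the quantities in parentheses must be 0, since we can vary the $p_j$
and $q^j$ independently. ❖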





The paradoxical fact that the critical paths for all variations, which we need
to look at for the extended Hamilton’s principle, can be found by examining
only the critical paths satisfying $p_j = \partial L/\partial \dot q^j$, the ones we need to look at
for Hamilton’s principle, is explained by a fact that arises from the geometric
definition of the Legendre function: applying equation (A′) on page 515, and
using the fact that the Legendre transformation is involutive, we see that for
given $(\dot q^1,\dots,\dot q^n)$, the point

$$(p_1,\dots,p_n) = \Big(\frac{\partial L}{\partial \dot q^1},\dots,\frac{\partial L}{\partial \dot q^n}\Big)$$

is already an extremal for

$$\sum_{i=1}^n p_i\,\dot q^i - H.$$


536 Chapter 17 


ADDENDUM 17A 


LIOUVILLE’S 
VOLUME THEOREM 


There are two important theorems by Liouville concerning Hamiltonian me- 
chanics. Here we present the simpler of the two, Liouville’s volume theorem. 
A slick proof will appear as a fleeting by-product of the material in Chapter 19 
(page 576), but Liouville actually proved a more general result than the one
obtained there, and it might be interesting to examine arguments similar to the 
original ones (a predilection of the author that will intrude itself, once again, in 
Chapter 21). We begin with a lemma due to Liouville, and an old-fashioned 
proof. 


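LEMMA (LIOUVILLE). For a 1-parameter family of maps $F_t\colon \mathbb{R}^n \to \mathbb{R}^n$,
given by

$$F_t(a_1,\dots,a_n) = (x_1(a_1,\dots,a_n,t),\dots,x_n(a_1,\dots,a_n,t)), \tag{$*$}$$

consider the 1-parameter family of vector fields X given by

$$X_i(t)(a_1,\dots,a_n) = x_i{}'(a_1,\dots,a_n,t), \quad\text{i.e.,}\quad = \frac{\partial x_i}{\partial t}(a_1,\dots,a_n,t).$$

Set

$$J(t) = \det M(t) = \det \frac{\partial(x_1,\dots,x_n)}{\partial(a_1,\dots,a_n)}$$

and let A be

$$\frac{\partial X_1}{\partial x_1} + \cdots + \frac{\partial X_n}{\partial x_n}.$$

Then

$$\frac{\partial J}{\partial t} = JA.$$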


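PROOF 1. The partial derivative $\partial J/\partial t$ is the sum of the determinants of n
matrices, of which the first, for example, is

$$\frac{\partial(X_1,x_2,\dots,x_n)}{\partial(a_1,\dots,a_n)}.$$

For the case n = 2, we have the matrix

$$\begin{pmatrix}
\dfrac{\partial X_1}{\partial x_1}\dfrac{\partial x_1}{\partial a_1} + \dfrac{\partial X_1}{\partial x_2}\dfrac{\partial x_2}{\partial a_1} &
\dfrac{\partial X_1}{\partial x_1}\dfrac{\partial x_1}{\partial a_2} + \dfrac{\partial X_1}{\partial x_2}\dfrac{\partial x_2}{\partial a_2} \\[2ex]
\dfrac{\partial x_2}{\partial a_1} & \dfrac{\partial x_2}{\partial a_2}
\end{pmatrix}.$$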


Liouville’s Volume Theorem 537 


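The determinant is not changed if we subtract $\partial X_1/\partial x_2$ times the second row
from the first row, which gives us the matrix

$$\begin{pmatrix}
\dfrac{\partial X_1}{\partial x_1}\dfrac{\partial x_1}{\partial a_1} & \dfrac{\partial X_1}{\partial x_1}\dfrac{\partial x_1}{\partial a_2} \\[2ex]
\dfrac{\partial x_2}{\partial a_1} & \dfrac{\partial x_2}{\partial a_2}
\end{pmatrix}$$

whose determinant is $\partial X_1/\partial x_1$ times J, and similarly the determinant of the
second matrix in the expansion of $\partial J/\partial t$ is $\partial X_2/\partial x_2$ times J. The whole argument
clearly extends to general n. ❖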


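PROOF 2. We only need to prove this for t close to any given $t_0$, so we can
assume that M is close to the identity, and thus that log M is well-defined. Then
we have

$$\det M = e^{\operatorname{trace}(\log M)}$$

(this is clearly true for diagonalizable complex M, which are dense, hence it is
true for all M).
Therefore,

$$\begin{aligned}
\frac{\partial J}{\partial t} &= J\cdot\operatorname{trace}\Big(M^{-1}\frac{\partial M}{\partial t}\Big)
 = J \sum_{i,j=1}^n \frac{\partial a_i}{\partial x_j}\,\frac{\partial x_j{}'}{\partial a_i} \\
 &= J \sum_{i,j,k=1}^n \frac{\partial a_i}{\partial x_j}\,\frac{\partial x_j{}'}{\partial x_k}\,\frac{\partial x_k}{\partial a_i}
 = J \sum_{j,k=1}^n \delta_{jk}\,\frac{\partial x_j{}'}{\partial x_k}
 = J \sum_{j=1}^n \frac{\partial x_j{}'}{\partial x_j}. \qquad ❖
\end{aligned}$$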
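
The lemma can be tested numerically on a flow that is known in closed form. The sketch below is our own illustration, not Liouville's: it assumes n = 2 and the linear vector field X(x) = Bx with the hypothetical matrix B = [[1, 1], [0, 2]], whose flow can be written down explicitly and whose divergence A is trace B = 3.

```python
# Numerical check of dJ/dt = J * A for the flow of X(x) = B x on R^2,
# with B = [[1, 1], [0, 2]] (a hypothetical example with an explicit flow).
import math

def F(a1, a2, t):
    # Solution at time t of x1' = x1 + x2, x2' = 2*x2 starting at (a1, a2).
    e1, e2 = math.exp(t), math.exp(2 * t)
    return (e1 * a1 + (e2 - e1) * a2, e2 * a2)

def J(a1, a2, t, h=1e-4):
    # Jacobian determinant d(x1, x2)/d(a1, a2) by central differences.
    xp, xm = F(a1 + h, a2, t), F(a1 - h, a2, t)
    yp, ym = F(a1, a2 + h, t), F(a1, a2 - h, t)
    m11 = (xp[0] - xm[0]) / (2 * h)
    m21 = (xp[1] - xm[1]) / (2 * h)
    m12 = (yp[0] - ym[0]) / (2 * h)
    m22 = (yp[1] - ym[1]) / (2 * h)
    return m11 * m22 - m12 * m21

A = 1.0 + 2.0                      # divergence of X(x) = B x is trace(B)
a1, a2, t, dt = 0.3, -0.8, 0.5, 1e-4
dJ_dt = (J(a1, a2, t + dt) - J(a1, a2, t - dt)) / (2 * dt)
J_times_A = J(a1, a2, t) * A
print(dJ_dt, J_times_A)
```

Here J(t) is exp(3t), so the two printed numbers should agree up to discretization error.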
jK=1 J 


COROLLARY. For a bounded open set $U \subset \mathbb{R}^n$, let $U_t = F_t(U)$, and let
A(t) be the n-dimensional volume of $U_t$. Then

$$A'(t) = \int_{U_t} A\;dx_1\cdots dx_n.$$

PROOF. By the change of variable formula for multiple integrals, we have

$$A(t) = \int_{U_t} dx_1\cdots dx_n = \int_{U_0} J\;da_1\cdots da_n,$$

so

$$A'(t) = \int_{U_0} \frac{\partial J}{\partial t}\;da_1\cdots da_n = \int_{U_0} AJ\;da_1\cdots da_n = \int_{U_t} A\;dx_1\cdots dx_n. \qquad ❖$$


538 Chapter 17. Addendum 17A 


COROLLARY. If A = 0, then the F; are volume preserving. 


Now suppose we have a Hamiltonian H on T*M and we consider the vector
field $X_H$ on page 533. This gives us a 1-parameter group of diffeomorphisms
$\phi_t\colon T^*M \to T^*M$ generated by $X_H$, the “flow” of $X_H$. (We might only have
a “local flow”, a local 1-parameter group of local diffeomorphisms, but for
simplicity we will simply speak in terms of flows.)

COROLLARY (LIOUVILLE’S VOLUME THEOREM). The maps $\phi_t$ of
the flow of $X_H$ are all volume preserving.


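PROOF. For

$$F_t(q^1,\dots,q^n,p_1,\dots,p_n) = \phi_t(q^1,\dots,q^n,p_1,\dots,p_n),$$

the $x_i$ are the solutions $(q^i, p_i)$ of (H), and the $x_i{}'$ are the $(\dot q^i, \dot p_i)$, so

$$A = \sum_{i=1}^n \frac{\partial}{\partial q^i}\Big(\frac{\partial H}{\partial p_i}\Big) + \sum_{i=1}^n \frac{\partial}{\partial p_i}\Big(-\frac{\partial H}{\partial q^i}\Big) = 0. \qquad ❖$$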


Although we have stated this result as a theorem on $\mathbb{R}^{2n}$, we can easily restate
it on T*M, in terms of the 2-form ω on T*M, as on page 512.

Liouville’s theorem plays an important role in thermodynamics, where we
have to analyze a very large number of particles. If we consider all the particles
in some region $U_0$ at some time $t_0$, and the region $U_t$ occupied by the same
particles at a later time t, Liouville’s theorem says that $U_t$ and $U_0$ have the
same volume, and applying this to a small region, we conclude that the density
of the particles remains constant. Of course, we should really think in terms of
an infinite number of particles in order for the region occupied by a collection
of particles to be defined, or else in some probabilistic terms. Fortunately, we
don’t have to get into that here.
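
The volume theorem can also be observed numerically. The following sketch rests on assumptions of ours: n = 1, the pendulum Hamiltonian H = p²/2 − cos q, and a Runge-Kutta approximation of the flow (which is only approximately volume preserving, with an error far below the tolerance used here). It estimates the Jacobian determinant of the time-1 map by finite differences.

```python
# Estimate the Jacobian determinant of the time-1 flow map of a pendulum,
# H = p^2/2 - cos(q), and compare it with 1 (volume preservation).
import math

def X_H(q, p):
    # Hamiltonian vector field: (dH/dp, -dH/dq).
    return (p, -math.sin(q))

def flow(q, p, t_end=1.0, n=1000):
    # Approximate flow map by classical Runge-Kutta steps.
    h = t_end / n
    for _ in range(n):
        k1 = X_H(q, p)
        k2 = X_H(q + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = X_H(q + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = X_H(q + h * k3[0], p + h * k3[1])
        q += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        p += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return q, p

def jacobian_det(q, p, eps=1e-5):
    # Central differences of the flow map in the q and p directions.
    qa, pa = flow(q + eps, p)
    qb, pb = flow(q - eps, p)
    qc, pc = flow(q, p + eps)
    qd, pd = flow(q, p - eps)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dq_dp, dp_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

det = jacobian_det(0.9, 0.4)
print(det)
```

A determinant equal to 1 at every point is the infinitesimal form of the statement that $U_t$ and $U_0$ have the same volume.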


The Interplay of Mechanics and Optics 539 


PROBLEMS 


1. (a) In dimension 1, the harmonic oscillator equation $mq'' + \omega^2 q = 0$ describes
the motion of a particle under an attractive central force $-\omega^2 q$; the
potential is $V = \omega^2 q^2/2$, and the Lagrangian is

$$L(q,\dot q) = \tfrac12 m\dot q^2 - \frac{\omega^2 q^2}{2}.$$

Show (cf. Problem 16-1, applied only to $\dot q$, not q!) that the Hamiltonian is

$$H(q,p) = \frac{p^2}{2m} + \frac{\omega^2 q^2}{2}.$$

(b) For a particle in $\mathbb{R}^3$ moving under a force with potential V(x,y,z), the
Hamiltonian is

$$H(x,y,z,p_x,p_y,p_z) = \frac{1}{2m}\big(p_x{}^2 + p_y{}^2 + p_z{}^2\big) + V(x,y,z).$$

(c) For a particle in $\mathbb{R}^2$ moving under a central force with potential function
V(r), using coordinates $q^1 = r$, $q^2 = \theta$ where r and θ are the polar coordinates,
the Lagrangian is (page 443)

$$L(r,\theta,\dot r,\dot\theta) = \tfrac12 m\big(\dot r^2 + r^2\dot\theta^2\big) - V(r).$$

Denoting the corresponding $p_1, p_2$ by $p_r, p_\theta$, show that the Hamiltonian is

$$H(r,\theta,p_r,p_\theta) = \frac{1}{2m}\Big(p_r{}^2 + \frac{p_\theta{}^2}{r^2}\Big) + V(r).$$
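(d) For the same problem in $\mathbb{R}^3$ with coordinates (r, θ, φ) defined by

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta$$

[note that θ is different from that used for the spherical pendulum], we have

$$L(r,\theta,\phi,\dot r,\dot\theta,\dot\phi) = \tfrac12 m\big(\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2\big) - V(r),$$

$$H(r,\theta,\phi,p_r,p_\theta,p_\phi) = \frac{1}{2m}\Big(p_r{}^2 + \frac{p_\theta{}^2}{r^2} + \frac{p_\phi{}^2}{r^2\sin^2\theta}\Big) + V(r).$$

2. (From Landau and Lifschitz [1].) In terms of the formula on page 530, the
partial derivative $\partial S/\partial q^i$ can be written as

$$\begin{aligned}
\frac{\partial S}{\partial q^i} &= \frac{d}{du}\bigg|_{u=0} S(q^1,\dots,q^i+u,\dots,q^n,t) \\
 &= \frac{d}{du}\bigg|_{u=0} \int_0^t L\big(c_{(q^1,\dots,q^i+u,\dots,q^n,\,t)}{}'(\tau)\big)\,d\tau,
\end{aligned}$$

where each $c_{(q,t)}$ is a solution of Lagrange’s equations. Use the Boundary Term
Corollary (page 462) to deduce that $\partial S/\partial q^i\,(q,t) = p_i(\gamma_q(t))$.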


CHAPTER 18 
HAMILTON-JACOBI THEORY 


man muss immer umkehren 
one must always invert 


— Jacobi 


Jacobi, unlike many of the continental mathematicians, read the Philosophical
Transactions of the Royal Society regularly, and Hamilton’s papers greatly excited
him and led him to reconsider the whole subject of dynamics. He looked
at all the results from quite a different point of view, which resulted in many
further developments.

While Hamilton had regarded the equations (H) on page 529 as a tool for
investigating the principal function, Jacobi brought these equations into prominence,
anointing them with their now standard name,

Hamilton’s canonical equations
$$\dot q^i = \frac{\partial H}{\partial p_i}, \qquad \dot p_i = -\frac{\partial H}{\partial q^i},$$

and he then turned his consideration to the equation that is known today as

The Hamilton–Jacobi equation
$$\frac{\partial S}{\partial t} + H\Big(q^1,\dots,q^n,\frac{\partial S}{\partial q^1},\dots,\frac{\partial S}{\partial q^n},t\Big) = 0.$$


The complete integral. Rather than trying to solve for S, and then using equa- 
tion (6) on page 531 to find equations for the p;(yg(t)), Jacobi approached 
Hamilton’s theory from the point of view of the general theory of first order 
PDE’s, especially in connection with the concept of a “complete integral” for 
a first order PDE. We will be able to state and prove the main result as soon 
as we’ve explained what a complete integral is, though the next two optional
sections are included for those who prefer some motivating ideas behind the 
result. 

For a general first order PDE on $\mathbb{R}^n$, with partial derivatives denoted by
subscripts, as on page 316,

$$F(x_1,\dots,x_n,u,u_{x_1},\dots,u_{x_n}) = 0,$$

i.e.,

$$F\Big(x_1,\dots,x_n,\,u(x_1,\dots,x_n),\,\frac{\partial u}{\partial x_1}(x_1,\dots,x_n),\dots,\frac{\partial u}{\partial x_n}(x_1,\dots,x_n)\Big) = 0,$$

we can often “solve” the equation in the sense of finding an n-parameter family
of solutions, usually by the method of separation of variables, a method that is
often used for higher-order equations also.


540 


Hamilton–Jacobi Theory 541


For example, in Chapter 8 we considered (pages 314-315) a second order
PDE, the 2-dimensional wave equation

$$\frac{\partial^2 u}{\partial t^2}(x,t) = v^2\,\frac{\partial^2 u}{\partial x^2}(x,t) \qquad\text{or}\qquad u_{tt} = v^2 u_{xx}$$

by looking for a solution of the form u(x,t) = X(x)T(t), finding that we must
have $T''(t)/T(t) = v^2 X''(x)/X(x)$, so that the two sides must be a constant.
For the heat equation for a function u on R,

$$u_{xx} = u_t$$

we obtain similarly $X''(x)/X(x) = T'(t)/T(t) = K$ and thus the solutions

$$u(x,t) = \begin{cases} a\sin\!\big(\sqrt{-K}\,(x-b)\big)\,e^{Kt} & K < 0 \\[1ex] a\sinh\!\big(\sqrt{K}\,(x-b)\big)\,e^{Kt} & K > 0. \end{cases}$$
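
A separated solution of this kind is easy to test by finite differences. The sketch below checks the K < 0 branch of the heat equation solution; the particular values of a, b and K are arbitrary choices of ours.

```python
# Finite-difference check that u = a*sin(sqrt(-K)*(x - b))*exp(K*t)
# satisfies the heat equation u_xx = u_t.  Parameter values are arbitrary.
import math

def u(x, t, a=1.0, b=0.2, K=-4.0):
    return a * math.sin(math.sqrt(-K) * (x - b)) * math.exp(K * t)

def residual(x, t, h=1e-4):
    # Second central difference in x, first central difference in t.
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    return u_xx - u_t

res = residual(0.9, 0.3)
print(res)  # should be close to zero
```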


In the case of a first order PDE, which is what interests us now, we usually
look for a solution expressed as a sum. As a very simple example, consider the
equation

$$u_x{}^2 + u_y{}^2 = 1.$$

Assuming a solution of the form u(x,y) = φ(x) + ψ(y), we obtain

$$(\phi'(x))^2 = a^2 = 1 - (\psi'(y))^2 \qquad\text{for some constant } a^2,$$

giving us the solutions

$$u(x,y) = ax + \big(\sqrt{1-a^2}\,\big)y + b$$

(the additive constant b is to be expected, since any solution u gives rise to the
solutions u + b). Similarly, in the equation


on page 524, for constant v we have the solutions 


V(x,y,zZ) = Slax + by + (v1 —a?—b?)z +c]. 


For a general first order PDE on $\mathbb{R}^n$

$$0 = F(x_1,\dots,x_n,u,u_{x_1},\dots,u_{x_n}) \tag{1}$$

we define a complete integral to be an n-parameter family of “independent”
solutions, that is, a function $\phi\colon \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ such that each

$$u(x_1,\dots,x_n) = \phi(x_1,\dots,x_n,a_1,\dots,a_n)$$


542 Chapter 18


is a solution of (1), and such that φ satisfies

$$0 \neq \det\Big(\frac{\partial^2\phi}{\partial a_i\,\partial x_j}\Big). \tag{2}$$





Condition (2) is actually only assumed to be true in some region about a given 
point (x, y,a,b), as our entire discussion is local, though for simplicity we will 
continue to write our equations blithely ignoring this fact. 


You can now skip right to page 547 if you want to avoid the computationally 
involved, yet revealing, motivation for Jacobi’s fundamental theorem. 
(Optional) Envelopes of solutions.¹ From the specialized set of solutions given by
a complete integral we can obtain a much larger class of solutions by means of
envelopes; for simplicity, we will illustrate the construction for the case n = 2.
Consider the 1-parameter family of solutions

$$u(x,y) = \phi(x,y,a,w(a)),$$

for some given function w. To find an envelope for this family the first step is
to “differentiate this equation with respect to a” (compare page 518), to get

$$0 = \phi_a(x,y,a,w(a)) + \phi_b(x,y,a,w(a))\cdot w'(a).$$

Because of (2), the implicit function theorem implies that if this equation holds
for some $x_0, y_0, a_0$, then in a neighborhood we can solve for a as a function
of x and y, i.e., there is a function $A\colon \mathbb{R}^2 \to \mathbb{R}$ such that the equation holds
when we replace a by A(x,y), so that

$$0 = \phi_a\big(x,y,A(x,y),w(A(x,y))\big) + \phi_b\big(x,y,A(x,y),w(A(x,y))\big)\cdot w'(A(x,y)). \tag{3}$$

We then substitute this solution back into u(x,y) = φ(x,y,a,w(a)) to get

$$\Phi(x,y) = \phi\big(x,y,A(x,y),w(A(x,y))\big),$$

and we claim that Φ is also a solution of (1).


¹ Although the optional sections are self-contained, they are easier to understand if one
already knows the basic facts about first order PDE’s, which can be found in the PDE
Primer starting on page 667.


Hamilton–Jacobi Theory 543


To prove this, we note that

$$\begin{aligned}
\Phi_x &= \phi_x + [\phi_a + \phi_b\,w'(A)]\,A_x \\
\Phi_y &= \phi_y + [\phi_a + \phi_b\,w'(A)]\,A_y,
\end{aligned}$$

where, in the usual way, the arguments of functions are ruthlessly suppressed.
But equation (3) says that the term in brackets vanishes, leaving us with $\Phi_x = \phi_x$
and $\Phi_y = \phi_y$. Since all u(x,y) = φ(x,y,a,b) satisfy (1), so that

$$0 = F\big(x,y,\phi(x,y,a,b),\phi_x(x,y,a,b),\phi_y(x,y,a,b)\big)$$

for all a and b, this holds in particular for a = A(x,y) and b = w(A(x,y)),
which just says that Φ also satisfies the equation.¹
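
To see the construction at work, here is a minimal sketch for the equation $u_x{}^2 + u_y{}^2 = 1$ with the family φ(x,y,a,b) = ax + √(1−a²) y + b. The choice w(a) = 0 is our own assumption, made only to keep the example small; with it the envelope of the family of planes should be the cone Φ(x,y) = √(x² + y²), restricted here to x, y > 0.

```python
# Envelope of the family phi(x, y, a, w(a)) for the equation u_x^2 + u_y^2 = 1,
# with phi = a*x + sqrt(1 - a^2)*y + b and the hypothetical choice w(a) = 0.
import math

def phi(x, y, a, b):
    return a * x + math.sqrt(1.0 - a * a) * y + b

def w(a):
    return 0.0      # hypothetical choice of the function w, so w'(a) = 0

def A(x, y):
    # Solve phi_a + phi_b * w'(a) = x - a*y/sqrt(1 - a^2) = 0 for a in (0, 1)
    # by bisection; valid for x > 0, y > 0, where the left side decreases in a.
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if x - mid * y / math.sqrt(1.0 - mid * mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Phi(x, y):
    a = A(x, y)
    return phi(x, y, a, w(a))

def grad_norm_sq(x, y, h=1e-5):
    # Phi_x^2 + Phi_y^2 by central differences; should equal 1.
    px = (Phi(x + h, y) - Phi(x - h, y)) / (2 * h)
    py = (Phi(x, y + h) - Phi(x, y - h)) / (2 * h)
    return px * px + py * py

print(Phi(3.0, 4.0), math.hypot(3.0, 4.0))
print(grad_norm_sq(3.0, 4.0))
```

The envelope is no longer a plane, yet it still solves the equation, which is the point of the argument above.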


Although we will not go into the details here, it turns out that by considering 
more than one envelope we can even arrange for a way to obtain the general 
solution for the first order PDE from a complete integral, with the additional 
steps involving only “algebraic” manipulations (including solving for implicitly 
defined functions), rather than ones involving derivatives. 


This entire analysis generalizes to the case of a complete integral for a first
order PDE on $\mathbb{R}^n$. In this general case, we consider an (n−1)-parameter family
of solutions

$$u(x_1,\dots,x_n) = \phi(x_1,\dots,x_n,a,w_2(a),\dots,w_n(a))$$

and form the envelope by solving the n − 1 equations

$$0 = \phi_a(x,a,w_2(a),\dots,w_n(a)) + \sum_{i=2}^n \phi_{a_i}(x,a,w_2(a),\dots,w_n(a))\,w_i{}'(a)$$

for a function $A\colon \mathbb{R}^n \to \mathbb{R}$ such that the equations hold when we replace a by
$A(x_1,\dots,x_n)$.


(Optional) Inverting the process; contact curves. Though all of this might be 
considered of some theoretical interest, it certainly doesn’t seem very promising 
as a way to solve mechanics problems! Even if we have a complete integral for 
the Hamilton—Jacobi equation, we would have to find envelopes, and solve al- 
gebraic relations between them to get a general solution of the equation, before 
we could use equation (6) on page 531 to find equations for the p;(yq(t)). 


¹ It should be pointed out that this construction of the envelope of a 1-parameter family
of a complete integral for a first order PDE on $\mathbb{R}^2$ does not correspond exactly to
what we did on page 518 for the 1-dimensional Clairaut equation and on page 519 for
the 2-dimensional equation; in those cases we found an envelope of the entire family
of solutions for the complete integral, giving us the “singular solution” of the Clairaut
equation, a topic we need not pursue here.


544 Chapter 18


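The necessary PDE background for Jacobi’s quite different approach involves
the “contact curves” along which the graph of Φ intersects the graph of any
one of the functions (x,y) ↦ φ(x,y,a,b) in our 1-parameter family.¹ This
amounts to saying that we are looking at points of the envelope where A(x,y),
and hence w′(A(x,y)), are constant. Since A satisfies (3) on page 542 this
means that the intersection curve can be parameterized as x(σ), y(σ), where

$$\begin{aligned}
\phi_a(x(\sigma),y(\sigma),a,b) &= C_a\,\sigma \\
\phi_b(x(\sigma),y(\sigma),a,b) &= C_b\,\sigma
\end{aligned} \tag{4}$$

for certain constants $C_a$ and $C_b$.
Differentiating equations (4) with respect to σ gives

$$\begin{aligned}
\phi_{ax}\frac{dx}{d\sigma} + \phi_{ay}\frac{dy}{d\sigma} &= C_a \\
\phi_{bx}\frac{dx}{d\sigma} + \phi_{by}\frac{dy}{d\sigma} &= C_b.
\end{aligned} \tag{5}$$

On the other hand, since φ is a complete integral we have

$$0 = F\big(x,y,\phi(x,y,a,b),\phi_x(x,y,a,b),\phi_y(x,y,a,b)\big)$$

for all a and b, so

$$\begin{aligned}
0 &= \frac{\partial F}{\partial a} = F_u\,\phi_a + F_p\,\phi_{xa} + F_q\,\phi_{ya} \\
0 &= \frac{\partial F}{\partial b} = F_u\,\phi_b + F_p\,\phi_{xb} + F_q\,\phi_{yb}.
\end{aligned}$$

Using (4), and dividing by $\sigma F_u$, we obtain

$$\begin{aligned}
\phi_{xa}\,(-F_p/\sigma F_u) + \phi_{ya}\,(-F_q/\sigma F_u) &= C_a \\
\phi_{xb}\,(-F_p/\sigma F_u) + \phi_{yb}\,(-F_q/\sigma F_u) &= C_b.
\end{aligned} \tag{6}$$

Comparing (5) and (6), and noting that by (2) on page 542 the determinant of
these two equations is non-zero, we conclude that

$$\frac{dx}{d\sigma} = -\frac{F_p}{\sigma F_u}, \qquad \frac{dy}{d\sigma} = -\frac{F_q}{\sigma F_u}.$$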

¹ We will be obtaining, by direct calculations, results that are obtained as part of the
basic theory of solutions for first order PDE’s, as in the PDE Primer.


Hamilton–Jacobi Theory 545


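By changing the arbitrary parameterization σ, we can then assume that

$$\frac{dx}{d\sigma} = F_p, \qquad \frac{dy}{d\sigma} = F_q. \tag{7}$$

In addition, by differentiating (1) on page 541 we get

$$0 = F_x + F_u\,p + F_p\,\frac{\partial p}{\partial x} + F_q\,\frac{\partial q}{\partial x} \tag{8x}$$

and a similar equation (8y) for y. Now equation (7) shows that

$$\begin{aligned}
F_p\,\frac{\partial p}{\partial x} + F_q\,\frac{\partial q}{\partial x}
 &= \frac{\partial p}{\partial x}\frac{dx}{d\sigma} + \frac{\partial q}{\partial x}\frac{dy}{d\sigma} \\
 &= \frac{\partial p}{\partial x}\frac{dx}{d\sigma} + \frac{\partial p}{\partial y}\frac{dy}{d\sigma}
   \qquad\text{since } \frac{\partial q}{\partial x} = \frac{\partial p}{\partial y} \\
 &= \frac{dp}{d\sigma}.
\end{aligned}$$

Therefore (8x) and the corresponding (8y) yield

$$\frac{dp}{d\sigma} = -(F_x + F_u\,p), \qquad \frac{dq}{d\sigma} = -(F_y + F_u\,q). \tag{9}$$

We also have

$$\frac{du}{d\sigma} = u_x\frac{dx}{d\sigma} + u_y\frac{dy}{d\sigma} = pF_p + qF_q \qquad\text{by (7).}$$

Combining this with (7) and (9), we have altogether

$$\begin{aligned}
\frac{dx}{d\sigma} &= F_p, & \frac{dy}{d\sigma} &= F_q, \\
\frac{dp}{d\sigma} &= -(F_x + F_u\,p), & \frac{dq}{d\sigma} &= -(F_y + F_u\,q), \\
\frac{du}{d\sigma} &= pF_p + qF_q.
\end{aligned} \tag{C}$$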

546 Chapter 18


For the general case of a first order PDE

$$0 = F(x_1,\dots,x_n,u,u_{x_1},\dots,u_{x_n}) = F(x_1,\dots,x_n,u,p_1,\dots,p_n)$$

we find, completely analogously, that the contact curves give us the equations

$$\frac{dx_i}{d\sigma} = F_{p_i} \tag{C1}$$

$$\frac{dp_i}{d\sigma} = -(F_{x_i} + F_u\,p_i) \tag{C2}$$

$$\frac{du}{d\sigma} = \sum_i p_i\,F_{p_i}. \tag{C3}$$

Now consider, in particular, the Hamilton–Jacobi equation, written in the
usual notation used for first order PDE’s as

$$\frac{\partial u}{\partial t} + H(x_1,\dots,x_n,p_1,\dots,p_n,t) = 0, \qquad p_i = \frac{\partial u}{\partial x_i}. \tag{$*$}$$

We have an equation in n + 1 variables, $x_1,\dots,x_n,t$, and it will be convenient
to temporarily write t as $x_{n+1}$, so that $\partial u/\partial t$ is just $p_{n+1}$, and we have the
equation

$$\begin{aligned}
0 &= F(x_1,\dots,x_n,x_{n+1},u,p_1,\dots,p_n,p_{n+1}) \\
  &= p_{n+1} + H(x_1,\dots,x_n,p_1,\dots,p_n,x_{n+1}).
\end{aligned}$$

Note that while the partial derivatives of u appear in the equation, u itself does
not, so $F_u = 0$.


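In (C1), the equation with i = n + 1 is just

$$\frac{dx_{n+1}}{d\sigma} = 1$$

so that we might as well take the parameter σ to be $x_{n+1} = t$. Since $F_u = 0$
we find that the equations (C) become

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i} \tag{C1}$$

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i} \tag{C2}$$

$$\frac{du}{dt} = \sum_{i=1}^n p_i\,\frac{\partial H}{\partial p_i} + \frac{\partial u}{\partial t} = \sum_{i=1}^n p_i\,\frac{\partial H}{\partial p_i} - H. \tag{C3}$$

Note that (C1) and (C2) are just the canonical equations for H, while (C3) says
that du/dt is the Legendre transform of H (compare equation (3) on page 530).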


Hamilton–Jacobi Theory 547


Jacobi’s theorem. Now suppose we have found a complete integral of the
Hamilton–Jacobi equation, written in the usual notation for first order PDE’s, as

$$\frac{\partial u}{\partial t} + H(x_1,\dots,x_n,p_1,\dots,p_n,t) = 0, \qquad p_i = \frac{\partial u}{\partial x_i}. \tag{$*$}$$

Jacobi’s idea was not to solve for u (the principal function S), but to solve the
corresponding canonical equations.


[This idea is so attractive and looks so promising because, as explained in the 
optional sections, the canonical equations are precisely the equations for the 
contact curves that we get when we produce new solutions by taking envelopes, 
which suggests that in trying to solve the canonical equations directly we might 
be able to short-circuit the whole process of taking envelopes. | 


The equation (*) is special, in that u itself does not appear in the equation,
only the partial derivatives of u, which requires a bit of jiggering with the
definitions. Clearly if u is a solution of (*), then so is u + a for any constant a.
So a complete integral amounts to a function $\phi\colon \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ such that

$$u(x_1,\dots,x_n,t) = \phi(x_1,\dots,x_n,a_1,\dots,a_n,t) + a \tag{a}$$

is a solution of (*) for any constants $a_1,\dots,a_n$ (and a), and condition (2) on
page 542 now says that at all points of the region of interest

$$0 \neq \det\Big(\frac{\partial^2\phi}{\partial a_i\,\partial x_j}\Big), \qquad i,j = 1,\dots,n. \tag{b}$$

With these matters settled, we now consider the set of equations

$$\frac{\partial\phi}{\partial a_j}(x_1,\dots,x_n,a_1,\dots,a_n,t) = b_j, \qquad j = 1,\dots,n$$

for any constants $a_1,\dots,a_n$ and $b_1,\dots,b_n$, where, with the usual abuse of notation,
the symbol $\partial/\partial a_j$ really means the partial derivative of φ with respect to
its (n + j)th argument. Because of condition (b), the implicit function theorem
implies that if this equation is satisfied for some $(x_1,\dots,x_n,t_0)$, then in some
neighborhood we can “solve implicitly for the $x_i$ as functions of t,” that is, there
are functions $X_1,\dots,X_n$ for which

$$\frac{\partial\phi}{\partial a_j}(X_1(t),\dots,X_n(t),a_1,\dots,a_n,t) = b_j, \qquad 1 \le j \le n. \tag{c}$$

We claim that the $X_i$ automatically satisfy the equations (with arguments on the
right omitted, as usual)

$$\frac{dX_i}{dt} = \frac{\partial H}{\partial p_i}.$$

548 Chapter 18


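To prove this we first take the partial derivative of (c) with respect to t, to get

$$\sum_{k=1}^n \frac{\partial^2\phi}{\partial a_j\,\partial x_k}\,\frac{dX_k}{dt} = -\frac{\partial^2\phi}{\partial a_j\,\partial t} \tag{d}$$

[$\partial/\partial x_k$ simply signifies the partial derivative with respect to the kth argument].
Next, we note that since (a) is a solution of (*) we have

$$\frac{\partial\phi}{\partial t} + H\Big(x_1,\dots,x_n,\frac{\partial\phi}{\partial x_1},\dots,\frac{\partial\phi}{\partial x_n},t\Big) = 0, \tag{e}$$

and taking the derivative with respect to $a_j$ gives

$$\frac{\partial^2\phi}{\partial a_j\,\partial t} + \sum_{k=1}^n \frac{\partial H}{\partial p_k}\,\frac{\partial^2\phi}{\partial a_j\,\partial x_k} = 0 \tag{f}$$

[$\partial/\partial p_k$ signifies the partial derivative with respect to the (n + k)th argument].
Comparing (d) and (f), we find that

$$\sum_{k=1}^n \frac{\partial^2\phi}{\partial a_j\,\partial x_k}\Big(\frac{dX_k}{dt} - \frac{\partial H}{\partial p_k}\Big) = 0, \qquad j = 1,\dots,n,$$

which by condition (b) implies that we do indeed have

$$\frac{dX_k}{dt} = \frac{\partial H}{\partial p_k}. \tag{H1}$$

Moreover, if we now set

$$p_j(t) = \frac{\partial\phi}{\partial x_j}(X_1(t),\dots,X_n(t),a_1,\dots,a_n,t),$$

then differentiating with respect to t gives

$$\frac{dp_j}{dt} = \sum_{k=1}^n \frac{\partial^2\phi}{\partial x_j\,\partial x_k}\,\frac{dX_k}{dt} + \frac{\partial^2\phi}{\partial x_j\,\partial t}, \tag{g}$$

while differentiating (e) with respect to $x_j$ gives

$$\frac{\partial^2\phi}{\partial x_j\,\partial t} = -\frac{\partial H}{\partial x_j} - \sum_{k=1}^n \frac{\partial H}{\partial p_k}\,\frac{\partial^2\phi}{\partial x_k\,\partial x_j}. \tag{h}$$

Subtracting (h) from (g) and using (H1) just proved, we get

$$\frac{dp_j}{dt} = -\frac{\partial H}{\partial x_j}. \tag{H2}$$

Note: Now that we’ve taken care of all the details, we can afford to be sloppy
and write the functions $X_i(t)$ in (c) simply as $x_i(t)$.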


Hamilton–Jacobi Theory 549


The discussion on pages 542-546 provides motivational background, but
the straightforward calculations on the previous page give Jacobi’s complete
proof, in his Vorlesungen über Dynamik, of this beautiful theorem, which you will
hardly ever find mentioned in a modern mechanics book:

JACOBI’S THEOREM. Given a complete integral φ of the Hamilton–Jacobi
equation, the solutions of the corresponding canonical equations for H can be
found by choosing constants $a_i$ and $b_i$, solving the equations

$$\frac{\partial\phi}{\partial a_j}(x_1,\dots,x_n,a_1,\dots,a_n,t) = b_j$$

for $x_i$, and then setting

$$p_j(t) = \frac{\partial\phi}{\partial x_j}(x_1(t),\dots,x_n(t),a_1,\dots,a_n,t).$$

It might seem that the usefulness of Jacobi’s theorem is somewhat limited by
the fact that the functions $x_i(t)$ are described only implicitly, by the n equations

$$\frac{\partial\phi}{\partial a_j}(x_1(t),\dots,x_n(t),a_1,\dots,a_n,t) = b_j.$$

Remember, however, that a complete integral φ is usually found, as in the
examples below, by separation of variables, so that φ is actually of the form

$$\phi_1(x_1,a_1,\dots,a_n) + \cdots + \phi_n(x_n,a_1,\dots,a_n),$$

which usually makes things a lot simpler.
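
Before turning to mechanics proper, it may help to run through the recipe in the most trivial case. The following worked example is our own illustration, under the assumption n = 1 and H(x,p) = p²/2 (a free particle of unit mass); it is not one of the examples treated below.

```latex
% Jacobi's recipe for the free particle, n = 1, H(x, p) = p^2/2 (assumed example).
\begin{align*}
&\text{Hamilton--Jacobi equation:} &
  \frac{\partial u}{\partial t} + \frac12\Big(\frac{\partial u}{\partial x}\Big)^2 &= 0, \\
&\text{complete integral (separating } t\text{):} &
  \phi(x,a,t) &= ax - \tfrac12 a^2 t, \\
&\text{independence condition:} &
  \frac{\partial^2\phi}{\partial a\,\partial x} &= 1 \neq 0, \\
&\text{implicit equation for } x(t)\text{:} &
  \frac{\partial\phi}{\partial a} = x - at &= b
  \quad\Longrightarrow\quad x(t) = b + at, \\
&\text{momentum:} &
  p(t) = \frac{\partial\phi}{\partial x} &= a.
\end{align*}
% Check of the canonical equations:
%   dx/dt = a = p = \partial H/\partial p,   dp/dt = 0 = -\partial H/\partial x.
```

The constants a and b turn out to be the momentum and the initial position, and the canonical equations hold without our ever having asked what the solution u itself means.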


Jacobi’s theorem and mechanics. We will now examine how Jacobi’s theorem 
can be used to study the equations of mechanics, which modern mechanics 
books usually postpone until the material of the next chapter has been presented. 

We switch back to $q^i$ for our basic variables, with the solutions to the canonical
equations for H corresponding to the solutions to Lagrange’s equations for the
Lagrangian L, and to streamline notation we now use S for φ, so that Jacobi’s
theorem states that for a complete integral S of the Hamilton–Jacobi equation,
solving the equations

$$\frac{\partial S}{\partial a_j}(q^1(t),\dots,q^n(t),a_1,\dots,a_n,t) = b_j$$

for the $q^i$ gives the solutions of Lagrange’s equations for L, with the conjugate
momenta $p_i$ given by

$$p_i = \frac{\partial S}{\partial q^i}(q^1(t),\dots,q^n(t),a_1,\dots,a_n,t).$$


We don’t even care what the solution of the Hamilton—Jacobi equation is, or, 
for that matter, what the equation is actually an equation for! 


550 Chapter 18 


It also might seem strange that we have replaced the problem of solving a 
system of n second order equations, or of 2n first order differential equations, 
with the problem of solving a partial differential equation, normally considered 
to be a harder problem. This will be explained to some extent in Chapter 19, 
but in any case, for most familiar problems of mechanics a complete integral 
can be found easily. In fact, in some cases finding a complete integral may 
be easier, or more straightforward, than solving the corresponding system of 
ordinary differential equations. In Addendum A we mention some problems 
that were solved precisely by this method, but for now we simply illustrate how 
the method works, by considering a few familiar simple cases, where, as might 
be expected, this method may actually be considerably more cumbersome than 
the elementary computations. 

The Hamiltonians for all these examples have been derived in Problem 17-1. 


• Harmonic Oscillator. Taking m = 1 for simplicity, the Hamiltonian is

$$H(q,p) = \tfrac12\big(p^2 + \omega^2 q^2\big),$$

so the Hamilton–Jacobi equation is a partial differential equation in two variables
q and t,

$$\frac{\partial S}{\partial t} + \frac12\Big(\frac{\partial S}{\partial q}\Big)^2 + \frac12\,\omega^2 q^2 = 0. \tag{a}$$

Now t is “cyclic”: the variable t itself doesn’t appear in the equation, only
derivatives with respect to t. This suggests that as a first step in finding a
complete integral we separate out t by looking for a function S of the form

$$S(q,t) = W(q) + f(t).$$

This will be a solution if

$$f'(t) = -H\Big(q,\frac{\partial W}{\partial q}\Big) = -a$$

for some constant a, so that we get the equations

$$S(q,a,t) = W(q,a) - at \tag{a1}$$

$$H\Big(q,\frac{\partial W(q,a)}{\partial q}\Big) = a. \tag{a2}$$

We write S(q,a,t) and W(q,a) even though a is a constant, because we are
trying to find a general integral S for (a), that is, a solution depending on a
parameter a. Since H doesn’t depend on t, we will have a = E for the solution
we ultimately obtain (page 529).


Hamilton—facobi Theory 551 


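Equation (a2) is

\[ \frac12\left(\frac{\partial W(q,\alpha)}{\partial q}\right)^2 = \alpha - \frac{\omega^2}{2}\,q^2, \]

giving

\[ W(q,\alpha) = \omega\int\sqrt{\frac{2\alpha}{\omega^2} - q^2}\;dq, \qquad S(q,\alpha,t) = \omega\int\sqrt{\frac{2\alpha}{\omega^2} - q^2}\;dq - \alpha t. \]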


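There's no point doing the integration right now, since we want to define $q(t)$ implicitly by the formula

\[ \frac{\partial S(q(t),\alpha,t)}{\partial\alpha} = b \]

for a constant $b$ (here, of course, it is definitely necessary to recognize the dependence of $S$ on $\alpha$!). So we want

\[ b = \frac{1}{\omega}\int\frac{dq}{\sqrt{\dfrac{2\alpha}{\omega^2} - q^2}}\;\Bigg|_{q = q(t)} - t, \tag{b} \]

giving

\[ t + b = -\frac{1}{\omega}\arccos\left(\frac{\omega\,q(t)}{\sqrt{2\alpha}}\right), \]

or, finally setting $\alpha = E$,

\[ q(t) = \frac{\sqrt{2E}}{\omega}\cos\omega(t + b). \tag{b$'$} \]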


The constants $E$ and $b$ can be determined from the initial values, say the values $q_0$, $p_0$ of $q$, $p$ at $t = 0$ (but only once the problem has been solved). Evaluating (b') at $t = 0$ gives $0 = -E + \tfrac12 p_0^2 + \tfrac12\omega^2 q_0^2$ and substituting this value of $E$ into (b') gives an equation for $\cos\omega b$, allowing $b$ to be determined. In the particular case where $t = 0$ is a maximum or minimum point, $p_0 = 0$, we simply have $E = \omega^2 q_0^2/2$, and (b') gives $q(t) = q_0\cos\omega(t + b)$ with $q(0) = q_0$, so we can just take $b = 0$.
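The recipe for recovering $E$ and $b$ from initial values can be tried numerically. The following is only a minimal sketch, assuming $m = 1$ as in the text; the function names are illustrative and not part of the original, and the momentum is taken to be $p = \dot q$.

```python
import math

def q_of_t(t, E, b, omega):
    """Solution (b') from the Hamilton-Jacobi method, with m = 1."""
    return math.sqrt(2.0 * E) / omega * math.cos(omega * (t + b))

def p_of_t(t, E, b, omega):
    """Momentum p = dq/dt for m = 1."""
    return -math.sqrt(2.0 * E) * math.sin(omega * (t + b))

def energy(q, p, omega):
    """H(q, p) = (p^2 + omega^2 q^2) / 2."""
    return 0.5 * (p * p + omega * omega * q * q)

def constants_from_initial_values(q0, p0, omega):
    """Recover E and b from q(0) = q0 and p(0) = p0 (assumes E > 0)."""
    E = energy(q0, p0, omega)
    # q0 = sqrt(2E)/omega * cos(omega*b) and p0 = -sqrt(2E) * sin(omega*b)
    b = math.atan2(-p0, omega * q0) / omega
    return E, b
```

With $p_0 = 0$ and $q_0 > 0$ the helper returns $b = 0$, matching the special case discussed above.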

Although this hardly looks like a more convenient way of analyzing the harmonic oscillator, it has the virtue of following a systematic method that doesn't depend on any previous knowledge, or any guessing, about the solution. The next example, in more than one dimension, makes a somewhat better case.




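• Central force in polar coordinates. In this example, we will now separate out not only $t$, but also one of the two space variables. We have

\[ L(r,\theta,\dot r,\dot\theta) = \frac{m}{2}\bigl(\dot r^2 + r^2\dot\theta^2\bigr) - V(r), \]

\[ H(r,\theta,p_r,p_\theta) = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + V(r). \]

Once again separating out $t$, we seek a solution of the Hamilton-Jacobi equation of the form

\[ S(r,\theta,\alpha,t) = W(r,\theta,\alpha) - \alpha t \tag{a1} \]

\[ \frac{1}{2m}\left[\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial\theta}\right)^2\right] + V(r) = \alpha. \tag{a2} \]

Now $\theta$, like $t$, is cyclic, leading us to look for a solution of (a2) of the form

\[ W(r,\theta,\alpha,\alpha_\theta) = R(r) + \alpha_\theta\cdot\theta \]

for a constant $\alpha_\theta$ with

\[ \alpha_\theta = \frac{\partial W}{\partial\theta} = p_\theta. \]

Then

\[ \frac{1}{2m}\left[\bigl(R'(r)\bigr)^2 + \frac{\alpha_\theta^2}{r^2}\right] + V(r) = \alpha, \]

hence

\[ W(r,\theta,\alpha,\alpha_\theta) = \int\sqrt{2m\bigl(\alpha - V(r)\bigr) - \frac{\alpha_\theta^2}{r^2}}\;dr + \alpha_\theta\,\theta. \]

So for two constants $b_1, b_2$ we want

\[ b_1 = \frac{\partial S}{\partial\alpha} = \int\frac{dr}{\sqrt{\dfrac{2}{m}\bigl(\alpha - V(r)\bigr) - \dfrac{\alpha_\theta^2}{m^2 r^2}}} - t \tag{b1} \]

\[ b_2 = \frac{\partial S}{\partial\alpha_\theta} = -\int\frac{\alpha_\theta\,dr}{r^2\sqrt{2m\bigl(\alpha - V(r)\bigr) - \dfrac{\alpha_\theta^2}{r^2}}} + \theta. \tag{b2} \]

Since $\alpha = E$, equation (b1), with $\alpha_\theta = mh$, is equivalent to (A) on page 121, giving $r$ as a function of $t$, and (b2) is equivalent to the equation on the top of page 122, determining the shape of the orbit, with $b_2 = \theta(0)$.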
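As a sanity check on the orbit equation (b2), one can take the free particle, $V = 0$, where the orbit must be a straight line. Writing $d = \alpha_\theta/\sqrt{2m\alpha}$, the integral in (b2) has the closed form $\arccos(d/r)$, so that $r\cos(\theta - b_2) = d$. The sketch below compares a numerical quadrature of the integrand with this closed form; the helper names and the midpoint rule are choices made for illustration only.

```python
import math

def orbit_integrand(r, m, alpha, alpha_theta, V):
    """Integrand of (b2): (alpha_theta / r^2) / sqrt(2m(alpha - V(r)) - alpha_theta^2 / r^2)."""
    radicand = 2.0 * m * (alpha - V(r)) - alpha_theta ** 2 / r ** 2
    return (alpha_theta / r ** 2) / math.sqrt(radicand)

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def swept_angle(r_start, r_end, m, alpha, alpha_theta, V):
    """Change in theta between two radii, according to (b2)."""
    return midpoint_integral(
        lambda r: orbit_integrand(r, m, alpha, alpha_theta, V), r_start, r_end
    )

def free_particle_angle(r_start, r_end, m, alpha, alpha_theta):
    """Closed form for V = 0: arccos(d/r) with d = alpha_theta / sqrt(2 m alpha)."""
    d = alpha_theta / math.sqrt(2.0 * m * alpha)
    return math.acos(d / r_end) - math.acos(d / r_start)
```

Both radii must exceed the distance of closest approach $d$, where the integrand becomes singular.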




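• Central force in spherical polar coordinates. We now consider the central force problem in 3 dimensions, mainly as an illustration that a complete integral may sometimes be found even when more than one variable is not cyclic. Using the coordinates on page 539, we have

\[ L(r,\theta,\phi,\dot r,\dot\theta,\dot\phi) = \frac{m}{2}\bigl(\dot r^2 + r^2\dot\theta^2 + r^2\cos^2\theta\,\dot\phi^2\bigr) - V(r) \]

\[ H(r,\theta,\phi,p_r,p_\theta,p_\phi) = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\cos^2\theta}\right) + V(r), \]

leading to

\[ S(r,\theta,\phi,\alpha,t) = W(r,\theta,\phi,\alpha) - \alpha t \tag{a1} \]

\[ \frac{1}{2m}\left[\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial\theta}\right)^2 + \frac{1}{r^2\cos^2\theta}\left(\frac{\partial W}{\partial\phi}\right)^2\right] + V(r) = \alpha. \tag{a2} \]

The variable $\phi$ is cyclic, but $r$ and $\theta$ are not. Nevertheless, we still look for a solution of the form

\[ W(r,\theta,\phi) = R(r) + \Theta(\theta) + \Phi(\phi). \]

Since $\phi$ is cyclic, we have, as in the previous example, $\Phi(\phi) = \alpha_\phi\cdot\phi$, for the constant $\alpha_\phi$, which is just the constant conjugate momentum $p_\phi$. We then have

\[ r^2\Bigl[R'^2 + 2m\bigl(V(r) - \alpha\bigr)\Bigr] = -\left[\Theta'^2 + \frac{\alpha_\phi^2}{\cos^2\theta}\right]. \]

Since the left side depends only on $r$ and the right side only on $\theta$, we set them both equal to a constant $-\alpha_\theta^2$, so that we have

\[ \Theta'^2 + \frac{\alpha_\phi^2}{\cos^2\theta} = \alpha_\theta^2, \qquad R'^2 + 2m\bigl(V(r) - \alpha\bigr) + \frac{\alpha_\theta^2}{r^2} = 0, \tag{a3} \]

and thus $W = W(r,\theta,\phi,\alpha,\alpha_\theta,\alpha_\phi)$ is given by

\[ W = \sqrt{2m}\int\sqrt{\alpha - V(r) - \frac{\alpha_\theta^2}{2mr^2}}\;dr + \int\sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\cos^2\theta}}\;d\theta + \alpha_\phi\,\phi. \]

For now, we simply use this example as an illustration of the process of finding complete integrals, leaving the intricacies of the solution to Problem 1.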
















• Arbitrary force on a particle. For a particle of mass $m$ under a force with potential $V(x,y,z)$, where (Problem 17-1 (b)) we have

\[ H(x,y,z,p_x,p_y,p_z) = \frac{1}{2m}\bigl(p_x^2 + p_y^2 + p_z^2\bigr) + V(x,y,z), \]

we now obtain the equation

\[ \left(\frac{\partial W}{\partial x}\right)^2 + \left(\frac{\partial W}{\partial y}\right)^2 + \left(\frac{\partial W}{\partial z}\right)^2 = 2m(\alpha - V). \]

We don't try to solve this in general, of course, but we want to point out that when $V = 0$, this has the same form as equation (C) on page 524 for the characteristic function in optics. For $W(x,y,z) = X(x) + Y(y) + Z(z)$ we get $X' = \alpha_x$, $Y' = \alpha_y$, $Z' = \alpha_z$ for constants $\alpha_x$, $\alpha_y$, and $\alpha_z$, and thus $p_x = \alpha_x$, giving $x = (\alpha_x/m)\,t + b_x$ for $x(0) = b_x$ and $x'(0) = \alpha_x/m$, and similarly for $y$ and $z$.

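Hamilton's characteristic function. In all of these examples, as in most problems of mechanics, the variable $t$ didn't appear in the Hamiltonian, so that we had an equation of the form

\[ \frac{\partial S}{\partial t} + H\!\left(q^1,\dots,q^n,\frac{\partial S}{\partial q^1},\dots,\frac{\partial S}{\partial q^n}\right) = 0, \tag{a} \]

and we separated out the variable $t$ to get

\[ S(q^1,\dots,q^n,t) = W(q^1,\dots,q^n) - \alpha t \tag{a1} \]

\[ H\!\left(q^1,\dots,q^n,\frac{\partial W}{\partial q^1},\dots,\frac{\partial W}{\partial q^n}\right) = \alpha, \tag{a2} \]

with $\alpha$ normally being the energy $E$ of our path.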

The function $W$ has already appeared in another context. Note that in equation (3) on page 530 the term $H(\gamma_q(t))$ is simply $E$, since we are assuming $H$ does not depend on $t$, so

\[ W(q,t) = S(q,t) + Et \]
\[ \qquad = \int_0^t\Bigl[\sum_{i=1}^n p_i(\gamma_q(\tau))\,(q^i\circ\gamma_q)'(\tau) - H(\gamma_q(\tau))\Bigr]\,d\tau + \int_0^t H(\gamma_q(\tau))\,d\tau \]
\[ \qquad = \int_0^t\sum_{i=1}^n p_i(\gamma_q(\tau))\,(q^i\circ\gamma_q)'(\tau)\,d\tau, \]

which is just the integral of the action (page 464) along the curve. As mentioned on page 527, Hamilton originally used this integral as his "characteristic function" for mechanics, which he later discarded in favor of his principal function $S$. So (a2), known as the "reduced Hamilton-Jacobi equation", is also referred to as the "Hamilton-Jacobi equation for Hamilton's characteristic function".




HAMILTON-JACOBI THEORY AND THE SCHRÖDINGER WAVE EQUATION. While Jacobi's presentation of Hamilton's work had the beneficial effect of simplifying many of Hamilton's constructions and introducing Hamiltonian mechanics to a much wider audience, it also had the unfortunate side-effect that the connection with Hamilton's earlier optical investigations was almost completely neglected.¹

This connection was only brought back into prominence with the advent of the Schrödinger wave equation. Building on de Broglie's notion that a moving particle with momentum $p$ and energy $E$ has associated with it a "wave" with wave length $\lambda$ and frequency $\nu$ satisfying

\[ \lambda = h/p, \qquad \nu = E/h \qquad\text{for Planck's constant } h, \]

Schrödinger sought a wave-like equation to explain the emission spectra of atoms, which have the quantum mechanical aspect of taking on only discrete values. After publishing his basic papers in German physics journals in 1926, at the end of that year he published an English review article, Schrödinger [1], outlining how one can be led to his equation by exploring the correspondence between Hamilton's optical and mechanical theories.


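Schrödinger begins with the simple example of a particle of mass $m$ acted on by a force with potential function $V$, as on page 554, where the equation for Hamilton's characteristic function $W$ can be written as

\[ |\operatorname{grad} W| = |(W_x, W_y, W_z)| = \sqrt{\left(\frac{\partial W}{\partial x}\right)^2 + \left(\frac{\partial W}{\partial y}\right)^2 + \left(\frac{\partial W}{\partial z}\right)^2} = \sqrt{2m(E - V)}. \]

Since $\operatorname{grad} W$ is always perpendicular to the sets $W = \text{constant}$, for small $\Delta a$ the perpendicular distance $d$ between a point of $W = a$ and the set $W = a + \Delta a$ is approximately $\Delta a/|\operatorname{grad} W|$ evaluated at that point.

[Figure: the level sets $W = a$ and $W = a + \Delta a$, a perpendicular distance $d \approx \Delta a/|\operatorname{grad} W|$ apart.]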


¹ Leading to the historical irony that Bruns [1] in 1895 reinvented Hamilton's characteristic function for optics, which he called the eikonal, from the Greek εἰκών = image, the term now standard in optics, and apparently unaware of Hamilton's optical work, though acquainted with Hamiltonian mechanics, he remarked that "The eikonal concept now plays an entirely similar role as the Hamiltonian viewpoint in mechanics, though, to be sure, in the far narrower domain of geometrical optics."




We can also consider the surfaces $S = \text{constant}$, though these constants are out of step with the constants for the surfaces $W = \text{constant}$. At time $t = 0$ the surface $S = S_0$ is the same as the surface $W = S_0$, while at time $\Delta t$ the surface $S = S_0$ is the same as the surface $W = S_0 + E\,\Delta t$. So, if we think of the surface $S = S_0$ as parameterized by time $t$, after a short time $\Delta t$, a point on the surface $S = S_0$ will have moved a distance of approximately $E\,\Delta t/|\operatorname{grad} W|$, with a velocity close to

\[ v = \frac{E\,\Delta t}{|\operatorname{grad} W|\,\Delta t} = \frac{E}{\sqrt{2m(E - V)}}, \tag{1} \]

which plays the role of the "velocity" $v$ in our wave equations on pages 315-317 and in Addendum 15B, except that it is no longer a constant.

[Figure: the surface $S = S_0$ at times $0$ and $\Delta t$, coinciding with $W = S_0$ and $W = S_0 + E\,\Delta t$, a distance $d \approx E\,\Delta t/|\operatorname{grad} W|$ apart.]

[Since

\[ \frac{\partial W}{\partial x} = \frac{\partial S}{\partial x} = p_x, \qquad \frac{\partial W}{\partial y} = \frac{\partial S}{\partial y} = p_y, \qquad \frac{\partial W}{\partial z} = \frac{\partial S}{\partial z} = p_z, \]

the vector $\operatorname{grad} W$ is just the momentum vector $\mathbf{p}$ of the particle, so this velocity $v$ varies inversely with the velocity of the particle, $|\operatorname{grad} W|$; as on page 526 (with $S$ playing the role of the optical characteristic function $V$), when the wave is moving fast, with $v$ large, the "wave fronts" $S = \text{constant}$ are close together, so that $\mathbf{p}$ is small.]

Schrödinger concludes the first section of his paper by noting Hamilton's correlation of Huygens' construction with Fermat's principle, and begins the second section by saying:


    Nothing of what has hitherto been said is in any way new. All this was very much better known to Hamilton himself than it is in our day to a good many physicists. Indeed, the theory of the propagation of light in a non-homogeneous medium, which Hamilton had developed about ten years earlier, became, by the striking analogy which occurred to him, the starting-point for his famous theories in pure mechanics. Notwithstanding the great popularity reached by the latter, the way which had led to them was nearly forgotten.




This optical-mechanical correspondence, however, really applies only to geometrical optics, which is merely an approximation to the wave theory of optics, and Schrödinger compares this to the fact that

    ... ordinary mechanics is really not applicable to mechanical systems of the very small, viz. of atomic dimensions. Taking into account this fact, which impresses its stamp upon all modern physical reasoning, is one not greatly tempted to investigate whether the non-applicability of ordinary mechanics to micro-mechanical problems is perhaps of exactly the same kind as the non-applicability of geometrical optics to the phenomena of diffraction or interference and may, perhaps, be overcome in an exactly similar way?


Answering his own question in the positive, he notes that

    ... Well known methods of wave-theory, somewhat generalized, lend themselves readily. The conceptions, roughly sketched in the preceding are fully justified by the success which has attended their development.


Just as the equation for the vibrating string with fixed ends (pages 314-315) gives us solutions $A(x)\sin(\omega_n t + \delta)$ for suitable constants $\omega_n$, Schrödinger looks for a wave-function $\psi$ of the form

\[ \psi = A(x,y,z)\sin(S/K) = A(x,y,z)\sin\bigl(-Et/K + W(x,y,z)/K\bigr), \]

for some constant $K$. Since $S$ has the dimensions $ET$ of action, the constant $K$ must also have these dimensions, which is the siren call of Planck's constant $h$. In fact, since the frequency of our wave is

\[ \nu = E/2\pi K, \]

we can get de Broglie's relationship $\nu = E/h$ by choosing $K = h/2\pi$.

Now we want to look at the wave equation

\[ \Delta\psi - \psi''/v^2 = 0, \]

and substitute for $v$ the expression given by equation (1). Since $\psi$ depends only on the space coordinates and on the frequency $E/h$, we want the same to be true for $\psi''$. This means that the dependence of $\psi$ on time should involve only the factor $e^{\pm 2\pi i E t/h}$, so that

\[ \psi'' = -4\pi^2 E^2\psi/h^2, \]




leading finally to the Schrödinger wave equation:

\[ \Delta\psi + 8\pi^2 m(E - V)\psi/h^2 = 0. \]

This is not, of course, a derivation, but something closer to a divination, fitting quite comfortably into the tradition of the investigators into the theory of light who preceded him. And it worked:

    Putting for instance

    \[ V = -e^2/r, \]

    ($e$ = electronic charge, $r = (x^2 + y^2 + z^2)^{1/2}$), we get for the simplified hydrogen atom or one body problem:

    \[ \Delta\psi + 8\pi^2 m(E + e^2/r)\psi/h^2 = 0. \]

    Now this equation for a great part of the possible values of the energy or frequency constant $E$, proves to offer no solution at all which is continuous, finite and single-valued throughout the whole space [and approaching 0 at $\infty$. The set of negative values that do,]

    \[ E_n = -2\pi^2 m e^4/h^2 n^2 \qquad (n = 1, 2, 3, 4, \dots) \]

    ... corresponds exactly to Bohr's stationary energy levels of the elliptic orbits.
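The quoted energy levels are easy to evaluate. The sketch below assumes Gaussian (CGS) units, which is the system in which $V = -e^2/r$ holds without further constants; the numerical values of the constants are standard reference values supplied here as assumptions, not taken from the text.

```python
import math

# Assumed CGS (Gaussian) reference values, for illustration only.
ELECTRON_MASS_G = 9.1093837e-28        # grams
ELECTRON_CHARGE_ESU = 4.80320471e-10   # statcoulombs
PLANCK_H_ERG_S = 6.62607015e-27        # erg seconds
ERG_PER_EV = 1.602176634e-12           # ergs per electron volt

def bohr_energy_ev(n):
    """E_n = -2 pi^2 m e^4 / (h^2 n^2), converted from ergs to electron volts."""
    energy_erg = (
        -2.0 * math.pi ** 2 * ELECTRON_MASS_G * ELECTRON_CHARGE_ESU ** 4
        / (PLANCK_H_ERG_S ** 2 * n ** 2)
    )
    return energy_erg / ERG_PER_EV
```

With these values `bohr_energy_ev(1)` comes out close to the familiar ground-state energy of about $-13.6$ eV.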


Which is all we will have to say on this matter, for a long, long time. 


Motion in the Field of Two Fixed Masses 559 


ADDENDUM 18A 


MOTION IN THE FIELD 
OF TWO FIXED MASSES 


GEODESICS ON ELLIPSOIDS 


We consider the problem, posed by Euler in 1760, of a body moving in the gravitational field provided by two fixed masses. That would seem to be only of theoretical interest, since the two bodies certainly won't remain fixed, so that it is more like a poor-man's version of the restricted three-body problem considered in Addendum 10A, where the two bodies remain at a fixed distance from each other. On the other hand, it can be applied, for example, to the problem of a negatively charged body moving in the field of two fixed positively charged bodies. The solution even has a use in modern-day problems involving gravitation, as we will mention a little later.

Jacobi's method for solving this problem depends on a suitable standard coordinate system called elliptic coordinates. We will consider only motion in a plane, with the two bodies located at $(c,0)$ and $(-c,0)$, and for simplicity assume the two bodies have the same mass $m$, with the third body simply having mass 1. If $r_1$ and $r_2$ are the distances from a point in the plane to the points $(c,0)$ and $(-c,0)$, we define

\[ \lambda = \tfrac12(r_1 + r_2), \qquad \mu = \tfrac12(r_1 - r_2), \]

with the curves $\lambda = \text{constant}$ being ellipses having the fixed points as foci, and the curves $\mu = \text{constant}$ being one branch of a hyperbola with the same foci.

[Figure: the two foci $(-c,0)$ and $(c,0)$.]

A not inconsiderable amount of grunt work is first required to obtain the basic facts about these coordinates. From

\[ r_1^2 = (x - c)^2 + y^2, \qquad r_2^2 = (x + c)^2 + y^2 \]

we get

\[ x^2 + y^2 + c^2 = \tfrac12\bigl(r_2^2 + r_1^2\bigr) = \lambda^2 + \mu^2 \tag{a} \]
\[ cx = \tfrac14\bigl(r_2^2 - r_1^2\bigr) = -\lambda\mu. \tag{b} \]

Substituting $\mu = -cx/\lambda$ into (a) then gives

\[ c^2 y^2 = (\lambda^2 - c^2)(c^2 - \mu^2). \tag{c} \]



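If the path of the body has coordinates $x(t)$, $y(t)$ and corresponding $\lambda(t)$, $\mu(t)$, we then have

\[ c\,x' = -(\lambda'\mu + \lambda\mu'), \]

\[ c^2 y'^2 = (\lambda^2 - c^2)(c^2 - \mu^2)\left(\frac{\lambda\lambda'}{\lambda^2 - c^2} - \frac{\mu\mu'}{c^2 - \mu^2}\right)^2, \]

leading finally to

\[ T = \tfrac12\bigl(x'^2 + y'^2\bigr) \]
\[ \quad = \frac{1}{2c^2}\left[(\lambda'\mu + \lambda\mu')^2 + (\lambda^2 - c^2)(c^2 - \mu^2)\left(\frac{\lambda\lambda'}{\lambda^2 - c^2} - \frac{\mu\mu'}{c^2 - \mu^2}\right)^2\right] \]
\[ \quad = \tfrac12(\lambda^2 - \mu^2)\left(\frac{\lambda'^2}{\lambda^2 - c^2} + \frac{\mu'^2}{c^2 - \mu^2}\right), \]

while for the potential function $V$ we have (taking the constant of gravitation as 1 for convenience)

\[ V = -\frac{m}{r_1} - \frac{m}{r_2} = -m\left(\frac{1}{\lambda + \mu} + \frac{1}{\lambda - \mu}\right) = -\frac{2m\lambda}{\lambda^2 - \mu^2}. \]

Thus the Hamiltonian is

\[ H = \tfrac12\,p_\lambda^2\,\frac{\lambda^2 - c^2}{\lambda^2 - \mu^2} + \tfrac12\,p_\mu^2\,\frac{c^2 - \mu^2}{\lambda^2 - \mu^2} - \frac{2m\lambda}{\lambda^2 - \mu^2}, \]

and the reduced Hamilton-Jacobi equation is

\[ \left(\frac{\partial W}{\partial\lambda}\right)^2(\lambda^2 - c^2) + \left(\frac{\partial W}{\partial\mu}\right)^2(c^2 - \mu^2) = 2\alpha(\lambda^2 - \mu^2) + 4m\lambda, \]

which is easy to separate, writing $W(\lambda,\mu) = \Lambda(\lambda) + M(\mu)$. The resulting equations give explicit expressions for the motion of the body in terms of elliptic integrals.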
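The identities for elliptic coordinates lend themselves to a quick numerical spot check. The sketch below is illustrative only: the function names are invented here, the inverse map is restricted to the upper half-plane $y \ge 0$, and the velocity is approximated by a central difference rather than computed symbolically.

```python
import math

def to_elliptic(x, y, c):
    """(lambda, mu) from Cartesian coordinates, with foci at (c, 0) and (-c, 0)."""
    r1 = math.hypot(x - c, y)   # distance to (c, 0)
    r2 = math.hypot(x + c, y)   # distance to (-c, 0)
    return 0.5 * (r1 + r2), 0.5 * (r1 - r2)

def from_elliptic(lam, mu, c):
    """Inverse map for y >= 0, using c x = -lam mu and c^2 y^2 = (lam^2 - c^2)(c^2 - mu^2)."""
    x = -lam * mu / c
    y = math.sqrt((lam ** 2 - c ** 2) * (c ** 2 - mu ** 2)) / c
    return x, y

def kinetic_cartesian(lam, mu, dlam, dmu, c, h=1e-6):
    """T = (x'^2 + y'^2)/2, with x', y' from a central difference along (dlam, dmu)."""
    x_plus, y_plus = from_elliptic(lam + h * dlam, mu + h * dmu, c)
    x_minus, y_minus = from_elliptic(lam - h * dlam, mu - h * dmu, c)
    dx = (x_plus - x_minus) / (2.0 * h)
    dy = (y_plus - y_minus) / (2.0 * h)
    return 0.5 * (dx * dx + dy * dy)

def kinetic_elliptic(lam, mu, dlam, dmu, c):
    """T in elliptic coordinates, the last line of the computation above."""
    return 0.5 * (lam ** 2 - mu ** 2) * (
        dlam ** 2 / (lam ** 2 - c ** 2) + dmu ** 2 / (c ** 2 - mu ** 2)
    )

def potential(lam, mu, m):
    """V = -2 m lam / (lam^2 - mu^2), with the constant of gravitation taken as 1."""
    return -2.0 * m * lam / (lam ** 2 - mu ** 2)
```

Comparing `kinetic_cartesian` with `kinetic_elliptic` at a generic point (with $\lambda > c > |\mu|$) gives agreement up to the finite-difference error.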

Considerable discussion of the orbits, where the two fixed bodies also need not have the same mass, is given in Pars [1; §17.10]. This problem is also treated in Boccaletti and Pucacco [1; §5.6], with §5.7 explaining how the problem can be applied to satellites in the gravitational field of the oblate-spheroid shaped earth, which is approximated by two fixed bodies whose masses and distances from the center are imaginary numbers!




Jacobi also defined elliptical coordinates for 3-dimensional space; they are the three roots $\lambda_1$, $\lambda_2$, and $\lambda_3$ of the equation

\[ g(\lambda) = \frac{x^2}{a^2 - \lambda} + \frac{y^2}{b^2 - \lambda} + \frac{z^2}{c^2 - \lambda} = 1 \qquad\text{for given } 0 < a^2 < b^2 < c^2. \]

For $\lambda < a^2$ we obtain ellipsoids, for $a^2 < \lambda < b^2$ hyperboloids of one sheet, and for $b^2 < \lambda < c^2$ hyperboloids of two sheets. For any $(x,y,z)$ with $x, y, z \ne 0$, the function $g(\lambda) - 1$ clearly must be 0 for at least one $\lambda_1 < a^2$, one $\lambda_2$ with $a^2 < \lambda_2 < b^2$, and one $\lambda_3$ with $b^2 < \lambda_3 < c^2$. There are only 3 roots, since $g(\lambda) - 1 = 0$ is equivalent to a cubic equation in $\lambda$. Thus one surface from each family passes through each such point $(x,y,z)$. At a point $(x,y,z)$ on the surface $g(\lambda_i) = 1$, the normal vector has the direction

\[ \tfrac12\bigl(D_1 g(\lambda_i),\ D_2 g(\lambda_i),\ D_3 g(\lambda_i)\bigr) = \left(\frac{x}{a^2 - \lambda_i},\ \frac{y}{b^2 - \lambda_i},\ \frac{z}{c^2 - \lambda_i}\right). \]

At a point $(x,y,z)$ on the two surfaces $g(\lambda_i) = 1$ and $g(\lambda_j) = 1$, the inner product of the two normal vectors is therefore

\[ \frac{x^2}{(a^2 - \lambda_i)(a^2 - \lambda_j)} + \frac{y^2}{(b^2 - \lambda_i)(b^2 - \lambda_j)} + \frac{z^2}{(c^2 - \lambda_i)(c^2 - \lambda_j)}, \]

which can be written as

\[ \frac{g(\lambda_i) - g(\lambda_j)}{\lambda_i - \lambda_j} = 0, \]

so our system is orthogonal.
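The existence of the three roots and the orthogonality of the three surfaces can be checked numerically at a sample point. This is a sketch under stated assumptions: the roots are found by plain bisection (a choice made here, relying on $g$ being increasing on each interval between poles), and all function names are illustrative.

```python
def g(lam, point, semi_axes_sq):
    """g(lambda) = x^2/(a^2 - lambda) + y^2/(b^2 - lambda) + z^2/(c^2 - lambda)."""
    return sum(coord ** 2 / (s - lam) for coord, s in zip(point, semi_axes_sq))

def bisect_root(f, lo, hi, iterations=200):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ellipsoidal_coordinates(point, semi_axes_sq, eps=1e-12):
    """The three roots of g(lambda) = 1 for a point with x, y, z all nonzero."""
    a2, b2, c2 = semi_axes_sq
    f = lambda lam: g(lam, point, semi_axes_sq) - 1.0
    r2 = sum(coord ** 2 for coord in point)
    # For lambda <= a2 - r2 - 1 every denominator is at least r2 + 1, so g < 1 there.
    lam1 = bisect_root(f, a2 - r2 - 1.0, a2 - eps)
    lam2 = bisect_root(f, a2 + eps, b2 - eps)
    lam3 = bisect_root(f, b2 + eps, c2 - eps)
    return lam1, lam2, lam3

def normal(lam, point, semi_axes_sq):
    """Normal direction (x/(a^2 - lam), y/(b^2 - lam), z/(c^2 - lam))."""
    return tuple(coord / (s - lam) for coord, s in zip(point, semi_axes_sq))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))
```

For a point such as $(0.7, 0.5, 0.9)$ with $(a^2, b^2, c^2) = (1, 2, 3)$, the three normals returned by `normal` are pairwise orthogonal to within rounding error.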
Jacobi used these coordinates for the purely geometric problem of geodesics on ellipsoids. A description of his results may be found in Arnold [2; §47]; the equations themselves are derived in Fasano and Marmi [1; §11.2].




ADDENDUM 18B 


HUYGENS’ CONSTRUCTION 
FOR HYPERBOLIC EQUATIONS 


We consider second order PDE's on $\mathbb{R}^n\times\mathbb{R}$ of the form

\[ \sum_{i,j=1}^n a_{ij}\,\frac{\partial^2 u}{\partial x^i\,\partial x^j}(x,t) = \frac{\partial^2 u}{\partial t^2}(x,t), \tag{$*$} \]

where the symmetric matrix $A = (a_{ij})$ is everywhere positive definite, so that at each point it can be reduced to the wave equation in $\mathbb{R}^n$. For $n = 3$, equation (*) is the general wave equation in 3-space for a possibly non-homogeneous, non-isotropic medium, as discussed on pages 494-495, in the case where our indicatrices are simply ellipsoids.

The basic theory of such PDE’s leads to a result that truly deserves the name 
of Huygens’ construction, though it is also sometimes called Huygens’ principle. 
We will only state the result here, but even the statement has been postponed 
until now because it relies on ideas that are illustrated to some extent in the 
PDE Primer at the end of this book, which was first mentioned in this chapter. 

A surface in $\mathbb{R}^3$, or more generally an $(n-1)$-dimensional submanifold of $\mathbb{R}^n$, is called a characteristic surface when there are two different solutions of (*) that are tangent along the surface, or equivalently, if initial conditions along the surface, together with initial conditions for the first derivative along the surface in the direction of a normal vector field, do not uniquely determine the solution in the region to which the normal vector field points. They play a role for second order PDE's analogous to that of the characteristic curves for first order PDE's.

Physically, characteristic surfaces are of interest because they represent "wave fronts": if we consider a solution $u$ of the wave equation representing the result of a disturbance beginning at a point (or more generally a closed set) at time $t = 0$, then at time $t$ the solution will be 0 outside of some surface $W_t$, the wave front at time $t$, so $W_t$ will be a characteristic surface, since the 0 solution and the solution $u$ will agree along $W_t$.

Now suppose that for each point $x \in W_t$ we consider the solution of the same wave equation for a disturbance starting at $x$ at time $t = 0$, and look at its wave front $w_x$ at time $\Delta t$. Then the envelope of all $w_x$ has two components, and it is a theorem that the one that lies in the region where $u = 0$ is the wave front $W_{t+\Delta t}$. This situation is again analogous to various considerations for first order PDE's discussed in the PDE Primer.




What about the other component of the envelope, the one within the region 
that the wave has already reached? This question presents a problem only if we 
allow the purely mathematical content of this result to be confused with some 
presumed physical mechanism. Courant and Hilbert [1; Chap. VI, §1.9] points
out that though we find “what seems at first sight to be a paradox”, it is easily 
put to rest by the observation that “A characteristic surface can, but need not, 
contain discontinuities of the solution u, and the envelope construction can also 
lead, without contradicting our theory, to surfaces on which the wave is not 
discontinuous at the time ¢.” [Italics mine.] 


This sort of resolution was not available in the original, very mechanistic, 
theories of light. After Huygens’ description of secondary waves (page 491), he 
adds that it would seem that the particles of the ether should all be of equal size, 
“because otherwise there ought to be some reflexion of movement backwards 
when it passes from a smaller particle to a larger one”. 

When Fresnel introduced Huygens' hypothesis of secondary waves into his transversal wave theory of light in order to account for diffraction, Huygens' mechanistic theory itself wasn't applicable, and Fresnel had to introduce an elaborate ingenious argument, together with just the right ad hoc assumptions, to show that the interfering secondary waves would exactly cancel out in the backwards direction.

Once the theory of light became part of Maxwell's theory of electromagnetic waves, the idea of secondary waves became not only unnecessary, but basically absurd: electromagnetic waves are emitted by accelerating charges, not by other electromagnetic waves. Then, of course, the problem became to explain why Fresnel's calculations did work, even though they were based on a fairy-tale construction. This was accomplished by Kirchhoff, who showed that Fresnel's construction could be regarded as an approximate form of an integral theorem.

Detailed explanations of Fresnel's constructions, and of Kirchhoff's integral theorem, may be found in Born and Wolf [1; §§8.1-8.3].




PROBLEM 


1. Consider the central force problem in spherical polar coordinates, with $W$ as given on page 553.

(a) From the formula for $L$ we have

\[ p_\phi = m r^2\cos^2\theta\,\dot\phi. \]

Calculate that this is the vertical component of the angular momentum $\mathbf{L}$. Then use (a3) to show that

\[ \alpha_\theta^2 = p_\theta^2 + \frac{p_\phi^2}{\cos^2\theta} = m^2 r^4\bigl(\dot\theta^2 + \cos^2\theta\,\dot\phi^2\bigr), \]

and conclude that $\alpha_\theta$ is the length $L$ of the angular momentum $\mathbf{L}$. We will henceforth set

    $\alpha = E$, the constant energy of the final solution,
    $\alpha_\theta = L$,
    $\alpha_\phi = L_3$ as a convenient abbreviation.


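(b) For three constants $b_1, b_2, b_3$, we have the equations

\[ b_1 = \frac{\partial S}{\partial E} = -t + \sqrt{\frac{m}{2}}\int\frac{dr}{\sqrt{E - V(r) - \dfrac{L^2}{2mr^2}}} \tag{b1} \]

\[ b_2 = \frac{\partial W}{\partial L} = -\int\frac{L\,dr}{\sqrt{2m}\;r^2\sqrt{E - V(r) - \dfrac{L^2}{2mr^2}}} + \int\frac{L\,d\theta}{\sqrt{L^2 - \dfrac{L_3^2}{\cos^2\theta}}} \tag{b2} \]

\[ b_3 = \frac{\partial W}{\partial L_3} = -\int\frac{L_3\,d\theta/\cos^2\theta}{\sqrt{L^2 - \dfrac{L_3^2}{\cos^2\theta}}} + \phi. \tag{b3} \]

(c) Writing (b3) as

\[ \phi - b_3 = \int\frac{\sec^2\theta\,d\theta}{\sqrt{\left(\dfrac{L^2}{L_3^2} - 1\right) - \tan^2\theta}} \]

and noting that $\sec^2\theta\,d\theta = d(\tan\theta)$, conclude that

\[ \tan\theta = \sqrt{\frac{L^2}{L_3^2} - 1}\;\sin(\phi - b_3), \tag{i} \]

which is a plane orbit, since it is a linear relation between the direction cosines $(\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ \sin\theta)$.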
To connect this with the standard astronomical terminology, consider the following diagram (with the similar diagram for the Euler angles on the right).

[Figure: the orbital plane, showing the planet and the line of nodes; on the right, the similar diagram for the Euler angles.]

The horizontal plane on the left is the "plane of the ecliptic", the plane in which the earth's orbit lies, with the $x$-axis chosen so that it points toward the vernal equinox (which is used to denote both a direction, and a time). The angle $i$ is the "inclination", while $\Omega$, a.k.a. $\phi$, is the "longitude of the ascending node", and $\omega$, a.k.a. $\psi$, is the "argument of the perihelion". We've also added the normal $\mathbf{n}$ to the orbital plane, which is not a standard element of the astronomical picture.


Show that

\[ \mathbf{n} = (\sin i\sin\Omega,\ -\sin i\cos\Omega,\ \cos i), \]

and letting

\[ \mathbf{x} = r(\cos\theta\cos\phi,\ \cos\theta\sin\phi,\ \sin\theta) \]

(all letters in both of these equations are functions of time $t$), show that $\langle\mathbf{n},\mathbf{x}\rangle = 0$ is equivalent to

\[ \tan\theta = \tan i\,\sin(\phi - \Omega); \tag{ii} \]

comparison of (i) and (ii) then shows that

\[ b_3 = \Omega, \qquad \cos i = \frac{L_3}{L}. \tag{iii} \]




(d) In (a) of the figure below, we show the angle $\vartheta$ from the line of nodes to the planet, measured in the plane of the orbit. Part (b) shows the planet at position $A$, at distance $r$ from the sun $S$. Note that $\angle ABC = i$. Determine $AC$ and $AB$, and conclude that

\[ \sin\theta = \sin i\,\sin\vartheta. \tag{iv} \]

Using (iii) and (iv), write the second integral on the right of (b2) as

\[ \int\frac{L\,d\theta}{\sqrt{L^2 - L_3^2\sec^2\theta}} = \int\frac{\cos\theta\,d\theta}{\sqrt{\cos^2\theta - \cos^2 i}} \]

and then use the substitution (iv) to show that this integral is simply $\vartheta$, so that equation (b2) becomes

\[ \vartheta - b_2 = \int\frac{L\,dr}{\sqrt{2m}\;r^2\sqrt{E - V(r) - \dfrac{L^2}{2mr^2}}}, \]

which is the first equation at the top of page 122 for the polar coordinates $(r,\vartheta)$ in the orbital plane.

For $V = -mK/r$, we have the solution on pages 123-124, from which we see that $b_2 = \omega$, the value of $\vartheta$ for the perihelion. Equation (b1) then shows that $b_1$ is the time $t_0$ at which the planet is at the perihelion.





CHAPTER 19 


CANONICAL 
TRANSFORMATIONS 


... [T]he advantages of the Hamiltonian formulation lie not in its use as a calculational tool, but rather in the deeper insight it affords into the formal structure of mechanics. ... we are led to newer, more abstract ways of presenting the physical content of mechanics.


— Goldstein, Classical Mechanics 


This disclaimer, appearing at the beginning of the chapter on canonical transformations in Goldstein's standard text Classical Mechanics, serves as a warning to physicists that a barrage of mathematics is about to ensue, long before any physics makes an appearance. But for our readers, an apology is presumably not required for beginning with an extended mathematical presentation. We will take the time to consider several different approaches to the study of canonical transformations, and their interconnections.


Canonical transformations. Our oft-used device of choosing new coordinates to simplify problems can also be expressed in terms of a "transformation" or mapping: for example, for the map $\alpha(r,\theta) = (r\cos\theta, r\sin\theta)$ of a portion of $M = \mathbb{R}^2$ to itself, the Lagrangian $L\circ\alpha_*\colon TM\to\mathbb{R}$ may lead to easier equations than the original Lagrangian $L$.

Similarly, we might hope that a diffeomorphism $f\colon T^*M\to T^*M$ [not necessarily of the form $g^*$ for any $g$] can change Hamilton's equations for a given Hamiltonian $H$ into a set of equations for a simpler Hamiltonian $K$. So we first want to know which transformations always take equations in Hamiltonian form into new equations that are also in Hamiltonian form. Actually, most classical investigations implicitly asked when equations for $H$ become equations specifically for $H\circ f^{-1}$ (for the general formulation, see the bottom of page 571).

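We work locally, and think of each cotangent space $M_p^*$ simply as $\mathbb{R}^n$. To determine which transformations have the desired property, it is natural to introduce the $2n\times 2n$ matrix

\[ J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}, \qquad\text{where $I_n$ denotes the $n\times n$ identity matrix,} \]

since the Hamiltonian equations for a curve $c$ in $T^*M$,

\[ (q^i\circ c)'(t) = \frac{\partial H}{\partial p_i}(c(t)) = H_{p_i}(c(t)), \]
\[ (p_i\circ c)'(t) = -\frac{\partial H}{\partial q^i}(c(t)) = -H_{q^i}(c(t)), \]

can then be written as

\[ \dot c^{\,t} = J\,(DH)^t \tag{a} \]

where

    ${}^t$ denotes the transpose of a matrix,
    $\dot c$ denotes $\bigl((q^1\circ c)',\dots,(q^n\circ c)',(p_1\circ c)',\dots,(p_n\circ c)'\bigr)$,
    $(DH)$ denotes $\bigl(H_{q^1}\circ c,\dots,H_{q^n}\circ c,\ H_{p_1}\circ c,\dots,H_{p_n}\circ c\bigr)$.

Since $J^2 = -I_{2n}$, this can also be written as

\[ -J\,\dot c^{\,t} = (DH)^t. \tag{b} \]

For a map $f\colon T^*M\to T^*M$, the curve $\gamma = f\circ c$ satisfies

\[ \dot\gamma^{\,t} = (Df)\,\dot c^{\,t}, \tag{c} \]

where $(Df)$ denotes the Jacobian matrix of $f$ with respect to the coordinates $q^1,\dots,q^n,p_1,\dots,p_n$. And for nonsingular $Df$, the map $K = H\circ f^{-1}$ satisfies

\[ (DK)^t = (Df^{-1})^t(DH)^t \]
\[ \qquad = -(Df^{-1})^t J\,\dot c^{\,t} \qquad\text{by (b)} \tag{d} \]
\[ \qquad = -(Df^{-1})^t J\,(Df^{-1})\,\dot\gamma^{\,t} \qquad\text{by (c).} \]

Since $J^2 = -I_{2n}$ implies that $J^{-1} = -J$, this can be written as

\[ \dot\gamma^{\,t} = -(Df)\,J^{-1}(Df)^t(DK)^t = (Df)\,J\,(Df)^t(DK)^t, \tag{a$'$} \]

and the equations (a') will have the same form as (a) whenever $(Df)\,J\,(Df)^t = J$, or equivalently (taking inverses of both sides and using $J^{-1} = -J$) whenever $(Df)$ satisfies

\[ (Df)^t J\,(Df) = J. \tag{A} \]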


To interpret (A), recall that congruence of $m\times m$ matrices, $B = C^t A C$, is usually introduced in connection with the quadratic form associated with a symmetric matrix $A$. It may equally be applied to a skew-symmetric $A = (a_{ij})$, to which we associate the skew-symmetric map $A\colon\mathbb{R}^m\times\mathbb{R}^m\to\mathbb{R}$ defined by $A(e_i,e_j) = a_{ij}$, which can also be written as

\[ A = \frac12\sum_{i,j=1}^m a_{ij}\; e_i^*\wedge e_j^*. \]

Remark. Note, for later use, that if $v$ and $w$ are vectors ($1\times m$ matrices), then $A(w,v) = {}$(the single entry of) $wAv^t$.

As in the symmetric case, for a linear $T\colon\mathbb{R}^m\to\mathbb{R}^m$ with $T(e_i) = \sum_{k=1}^m c_{ki}\,e_k$ we have

\[ A(T(e_i),T(e_j)) = \sum_{k,l=1}^m c_{ki}c_{lj}\,A(e_k,e_l) = \sum_{k,l=1}^m c_{ki}c_{lj}\,a_{kl}, \]

which is the $(i,j)$ entry of the matrix $C^t A C$, so that $C^t A C$ is the matrix that corresponds to $A$ in the basis $\{T(e_i)\}$. Writing the standard basis of $\mathbb{R}^{2n}$ as $e_1,\dots,e_n,\bar e_1,\dots,\bar e_n$, the map corresponding to $J$ is

\[ \sum_{i=1}^n e_i^*\wedge\bar e_i^*. \]

Translating this in terms of the $dq^i$ and $dp_i$ on $M_p^*$, equation (A) thus tells us that $\sum_i dq^i\wedge dp_i = \sum_i f^*(dq^i)\wedge f^*(dp_i)$, so that

\[ f^*\omega = \omega, \tag{A$'$} \]

which is precisely the modern definition of a "canonical transformation".


DEFINITION. A diffeomorphism $f\colon T^*M\to T^*M$ is called a canonical transformation if the map $f^*\colon T^*(T^*M)\to T^*(T^*M)$ satisfies

\[ f^*\omega = \omega. \]


Note that $f^{-1}$ is also a canonical transformation. Since $f^*$ also preserves the $n$-fold product $\omega\wedge\cdots\wedge\omega$ considered on page 512, $f$ is orientation preserving, and in terms of coordinates $(q^1,\dots,q^n,p_1,\dots,p_n)$ around $f(p)$, which we will often denote by $(q,p)$ for brevity, and $(q\circ f, p\circ f)$ around $p$, we have $\det f^* = 1$ [equation (A) on the previous page already implies that $(\det f^*)^2 = 1$].

Having reached our elegant definition, we adopt the standard mathematical ploy of ditching the motivating considerations, and replacing them with a theorem. Recall, from page 533, that the solutions to Hamilton's canonical equations for the Hamiltonian $H$ are the same as the solutions of the vector field $X_H$ defined by

\[ X_H \mathbin{\lrcorner} \omega = -dH. \]

Such vector fields are called Hamiltonian vector fields.




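LEMMA. Let $f\colon M\to N$ be a diffeomorphism, $X$ a vector field on $M$, and $\lambda$ a $k$-form on $M$. Then

\[ (f^{-1})^*(X \mathbin{\lrcorner} \lambda) = f_*X \mathbin{\lrcorner} (f^{-1})^*\lambda. \]

PROOF. A straightforward unraveling of definitions. ❖

1. THEOREM. The diffeomorphism $f\colon T^*M\to T^*M$ is canonical if and only if for all $H$

\[ f_*(X_H) = X_{H\circ f^{-1}} = X_{(f^{-1})^*H}. \]

In particular, if $f$ is canonical, then $f_*$ always takes Hamiltonian vector fields into Hamiltonian vector fields.

PROOF. The definition $X_H \mathbin{\lrcorner} \omega = -dH$ and the Lemma give, respectively,

\[ (f^{-1})^*(X_H \mathbin{\lrcorner} \omega) = -(f^{-1})^*(dH) = -d(H\circ f^{-1}) \]
\[ (f^{-1})^*(X_H \mathbin{\lrcorner} \omega) = f_*(X_H) \mathbin{\lrcorner} (f^{-1})^*\omega, \]

so we have

\[ -d(H\circ f^{-1}) = f_*(X_H) \mathbin{\lrcorner} (f^{-1})^*\omega. \]

If $f$ is canonical, then $f^{-1}$ is also canonical, $(f^{-1})^*\omega = \omega$, so we have

\[ -d(H\circ f^{-1}) = f_*(X_H) \mathbin{\lrcorner} \omega \quad\Longrightarrow\quad f_*(X_H) = X_{H\circ f^{-1}}. \]

Conversely, if $f_*(X_H) = X_{H\circ f^{-1}}$ for all $H$, then we have $-d(H\circ f^{-1}) = X_{H\circ f^{-1}} \mathbin{\lrcorner} (f^{-1})^*\omega$ and thus $-dK = X_K \mathbin{\lrcorner} (f^{-1})^*\omega$ for all $K$, while also $-dK = X_K \mathbin{\lrcorner} \omega$ for all $K$, implying that $(f^{-1})^*\omega = \omega$. ❖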


At this point, it might be nice to have a few concrete examples of canonical transformations $f\colon T^*M\to T^*M$. Canonical transformations are often defined in terms of the corresponding change of variables: given coordinates $(q,p)$ on the domain, we give a formula for the corresponding coordinates

\[ Q = q\circ f,\quad P = p\circ f \qquad\text{with}\qquad dQ^i = f^*(dq^i),\quad dP_i = f^*(dp_i), \]

on the range. For example, in dimension 1, we can consider

\[ Q = q + p\tau + \tfrac12 g\tau^2, \qquad P = p + g\tau \]

for constants $g$ and $\tau$. Since

\[ dQ\wedge dP = f^*(dq + \tau\,dp)\wedge f^*dp = f^*(dq\wedge dp), \]

this is a canonical transformation. Often, we don't even bother writing the $f^*$.




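For example, if we define, for constants $\omega$ and $\tau$,

\[ Q = q\cos\omega\tau + \tfrac{1}{\omega}\,p\sin\omega\tau, \qquad P = -\omega q\sin\omega\tau + p\cos\omega\tau, \]

then

\[ dQ\wedge dP = \bigl(\cos\omega\tau\,dq + \tfrac{1}{\omega}\sin\omega\tau\,dp\bigr)\wedge\bigl(-\omega\sin\omega\tau\,dq + \cos\omega\tau\,dp\bigr) \]
\[ \qquad = \cos^2\omega\tau\;dq\wedge dp + \sin^2\omega\tau\;dq\wedge dp = dq\wedge dp. \]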
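Both of these examples are affine in $(q,p)$, so the matrix condition (A), $(Df)^t J\,(Df) = J$, can be checked directly on their Jacobian matrices. The following is a small sketch for the case $n = 1$, with coordinates ordered as $(q,p)$; the helper names are illustrative and not from the text.

```python
import math

# The matrix J for n = 1, in the coordinate order (q, p).
J = [[0.0, 1.0], [-1.0, 0.0]]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def satisfies_condition_A(m, tol=1e-12):
    """Check condition (A): M^t J M = J, for a 2 x 2 matrix M."""
    lhs = matmul(matmul(transpose(m), J), m)
    return all(abs(lhs[i][j] - J[i][j]) < tol for i in range(2) for j in range(2))

def shear_jacobian(tau):
    """Jacobian of Q = q + p*tau + g*tau^2/2, P = p + g*tau (the constant g drops out)."""
    return [[1.0, tau], [0.0, 1.0]]

def oscillator_jacobian(omega, tau):
    """Jacobian of Q = q cos(omega tau) + (p/omega) sin(omega tau), P = -omega q sin(omega tau) + p cos(omega tau)."""
    c, s = math.cos(omega * tau), math.sin(omega * tau)
    return [[c, s / omega], [-omega * s, c]]
```

For $2\times 2$ matrices the condition reduces to $\det M = 1$, so a rescaling such as `[[2.0, 0.0], [0.0, 1.0]]` fails the check.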


These examples might call to mind the equations for an object traveling in
a parabolic arc and for the harmonic oscillator, leading you to anticipate a
result from a later section. At the moment, however, the most important point
about these examples is that the $q^i$ and the $p_i$ are simply treated as independent
coordinates, as we noted on page 534. For Lagrange's equations, we can choose
the transformation $q^i \mapsto Q^i$ arbitrarily, but the total transformation
$(q^i, \dot q^i) \mapsto (Q^i, \dot Q^i)$ from $TM$ to $TM$ is then completely determined
if we want to end up with second order equations. But in the case of Hamilton's
equations, we can allow transformations from $T^*M$ to $T^*M$ that mix the $q^i$ and
$p_i$, though now we want to consider only those transformations that are canonical.
The equal footing of the $q^i$ and $p_i$ is strikingly illustrated by the canonical
transformation
$$Q^k = p_k, \qquad P_k = -q^k$$
for any $k$, which simply interchanges $q^k$ and $p_k$, with a sign change. In general,
any coordinates $\alpha^i, \beta_i$ for which we have
$$\omega = \sum_{i=1}^n d\beta_i \wedge d\alpha^i$$
are called canonical coordinates, with the $\beta_i$ canonically conjugate to the $\alpha^i$.
For one more example of a canonical transformation, which will appear in
Problem 3, we instead give formulas for $q = Q\circ f^{-1}$ and $p = P\circ f^{-1}$:
$$q = \sqrt{\frac{2P}{\omega}}\,\sin Q, \qquad p = \sqrt{2\omega P}\,\cos Q.$$

Simple calculations again prove that they define a canonical transformation.
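
The "simple calculations" for this last example can be sketched as follows. This is only an illustrative check, and the final line assumes the oscillator Hamiltonian is normalized as $H = \tfrac12(p^2 + \omega^2 q^2)$, a normalization that the passage above does not fix.

```latex
\begin{align*}
dq &= \frac{\sin Q}{\sqrt{2\omega P}}\,dP + \sqrt{\frac{2P}{\omega}}\,\cos Q\,dQ, \\
dp &= \sqrt{\frac{\omega}{2P}}\,\cos Q\,dP - \sqrt{2\omega P}\,\sin Q\,dQ, \\
dq\wedge dp &= -\sin^2 Q\;dP\wedge dQ + \cos^2 Q\;dQ\wedge dP = dQ\wedge dP, \\
% assumed normalization of H (not taken from the text):
H &= \tfrac12\Bigl(2\omega P\cos^2 Q + \omega^2\cdot\frac{2P}{\omega}\,\sin^2 Q\Bigr) = \omega P .
\end{align*}
```

Under that assumed normalization the new Hamiltonian depends on $P$ alone, which is presumably why this transformation is worth singling out.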


As already pointed out, Theorem 1 actually only identifies transformations
taking equations for the Hamiltonian $H$ into equations for the Hamiltonian
$H\circ f^{-1}$; the possibility that such equations are transformed into equations for
other Hamiltonians $K$, depending on $H$, is seldom addressed. But, in fact, if $f$
is a “generalized canonical transformation”, $f^*\omega = a\omega$ for a constant $a$, then
this is true for $K = (1/a)H\circ f^{-1}$, which can easily be seen from both our matrix
proof and the proof of Theorem 1, and simply amounts to reparameterizing
curves $c(t)$ as $c(at)$. The not-very-informative proof that these are the only
other possibilities is given in Addendum B.



Hamiltonian flows and integral invariants. In contrast to the previous section,
where we studied the relationship between an individual transformation
$f\colon T^*M \to T^*M$ and all Hamiltonian vector fields, we will now consider a
specific Hamiltonian vector field $X$, and look at the 1-parameter group of
diffeomorphisms $\phi_t\colon T^*M \to T^*M$ generated by $X$, as on page 538.

If we have a point $\mathfrak p \in T^*M$, we can consider (a) the curve $c_{\mathfrak p}$ in $T^*M \times \mathbb R$
given by $t \mapsto (\phi_t(\mathfrak p), t)$, showing the action of the $\phi_t$ on $\mathfrak p$. And if we have a
curve $\gamma\colon [0,1] \to T^*M$, given by $u \mapsto \gamma(u)$, then (b) we can join all the images
of the $c_{\gamma(u)}$ to obtain a surface, which is parameterized by the map
$$\alpha(u,t) = (\phi_t(\gamma(u)), t).$$

In particular (c), if we start with a curve $\gamma_1$ in $T^*M = T^*M \times \{0\}$, and follow
all the $c$ curves to time $T$, we end up with a curve of the form $\{(\gamma_2(u), T)\}$ for
some curve $\gamma_2$ in $T^*M$.

This picture is the basic set-up for the “integral invariants of Poincaré”, and
everything that we need to know about these invariants, with proofs complete,
will eventually be given in a few lines. But, emulating the presentation of the
previous section, we will slowly sneak up on the slick method. To begin, we will
reach back in time to the venerable classic Whittaker [1] for a version that is, in
its own way, pretty slick.


First approach. Suppose our Hamiltonian $H$ comes from a Lagrangian $L$.
Then the various curves $c$ are just the image under $\mathbb{F}L$ of curves $\tilde c$ in $TM$
satisfying the corresponding Lagrange equations, and $\alpha$ is just the image under
$\mathbb{F}L$ of a variation $\tilde\alpha$ of the curves $\tilde c$. Since all the curves $\tilde c$ in the variation
are extremals for $L$, we can apply the Boundary Term Corollary on page 462
at each $u$, to obtain
$$\frac{dJ(\bar\alpha(u))}{du} = \sum_{i=1}^n \frac{\partial\tilde\alpha^i}{\partial u}(u,t)\,\frac{\partial L}{\partial\dot q^i}\bigl(\bar\alpha(u)'(t)\bigr)\,\Big|_{t=0}^{t=T}, \tag{A}$$
where $J(\bar\alpha(u))$ is the integral of $L$ over $\bar\alpha(u) = t \mapsto \tilde\alpha(u,t)$ on $[0, T]$.



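$\mathrm{I}_1$. THEOREM. If $\gamma_1$ is a smooth closed curve, then
$$\int_{\gamma_1} \sum_{i=1}^n p_i\,dq^i = \int_{\gamma_2} \sum_{i=1}^n p_i\,dq^i.$$

PROOF. We integrate (A) from 0 to 1. On the left we get 0, since we are
integrating a derivative along a closed curve. Remembering that $\partial L/\partial\dot q^i = p_i$,
we see that on the right the terms for $t = 0$ and $t = T$ give, respectively,
$$\int_{\gamma_1} \sum_{i=1}^n p_i\,dq^i \qquad\text{and}\qquad \int_{\gamma_2} \sum_{i=1}^n p_i\,dq^i.$$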


The form $\theta = \sum_{i=1}^n p_i\,dq^i$ is called a 1-dimensional relative integral invariant
of Poincaré. The “relative” refers to the fact that its integral over a curve is an
invariant under the flow of the Hamiltonian only when the curve is closed. On
the other hand, if we consider a disc with boundary curve $\gamma_1$, or more generally,
a 2-chain $D_1$, with $\partial D_1 = \gamma_1$, and the corresponding $D_2$ with $\partial D_2 = \gamma_2$, then
$$\int_{D_1} d\theta = \int_{\gamma_1} \theta = \int_{\gamma_2} \theta = \int_{D_2} d\theta,$$
so that $\int_{D_1} \omega = \int_{D_2} \omega$. Thus $\omega$ is a 2-dimensional integral invariant of Poincaré.

In connection with Theorem $\mathrm{I}_1$, Whittaker, like many another author, points
out: “This is essentially the same as the hydrodynamical theorem that the
circulation in any circuit moving with the fluid does not alter with the time.”
Arnold [2; Chap. 9] provides a set of variations on this theme, and Kozlov [1]
orchestrates a full-scale symphony.


Second approach. Although this use of the Boundary Term Corollary is
intriguing, there is a more natural proof that will also give a more general result.
Recall that we used the equations at the bottom of page 531 to obtain the
Hamilton–Jacobi equation. In condensed notation, these equations might be
written
$$dS = \sum_{i=1}^n p_i\,dq^i - H\,dt, \tag{B}$$
which looks like an analogue of the equation on page 526
$$dV = \sum_{i=1}^3 Y_i\,dx^i,$$
except that (B) is meaningless as stated, since $dS$ is a 1-form on $M \times \mathbb R$ while the
right side is a 1-form on $T^*M \times \mathbb R$. However, given a variation $\alpha$ as on page 572,
we can define $S$ on the image surface (a) by letting $S(\alpha(u,t))$ be the integral
of $L$ on $[0,t]$ along the curve $\tilde c$ for which the corresponding $c$ contains $\alpha(u, t)$,
and then (B) holds on the image surface.

[Figure (a): the image surface through $\gamma_1$, with a point $\alpha(u, t)$ marked.]

We can then apply (B) to the “tube” $\mathcal T$ that we get (b) by finding solutions of
Hamilton's equations starting from a closed curve, and cutting it off along
curves $\gamma_1$ and $\gamma_2$.


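$\mathrm{I}_2$. THEOREM. For any closed curves $\gamma_1$ and $\gamma_2$ surrounding a tube made up
of solutions to Hamilton's equations, we have
$$\int_{\gamma_1} \sum_{i=1}^n p_i\,dq^i - H\,dt = \int_{\gamma_2} \sum_{i=1}^n p_i\,dq^i - H\,dt.$$

PROOF. Letting $\lambda = \sum_{i=1}^n p_i\,dq^i - H\,dt = dS$, we have
$$\int_{\gamma_2} \lambda - \int_{\gamma_1} \lambda = \int_{\gamma_2 - \gamma_1} \lambda = \int_{\gamma_2 - \gamma_1} dS = 0.$$

Theorem $\mathrm{I}_1$ is a special case, since we then have $t$ constant on $\gamma_1$ and $\gamma_2$, so
that the $H\,dt$ term is 0. The 1-form $\sum_{i=1}^n p_i\,dq^i - H\,dt$ is called the integral
invariant of Poincaré–Cartan (without bothering to add the “relative”).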


Third approach. Although our proof of Theorem $\mathrm{I}_2$ shows how the result
arises from the relationship between Hamilton's equations and $S$, we would like
to have a proof that uses Hamilton's equations directly, in line with our program
of focusing more attention on the Hamiltonian equations themselves. (Moreover,
our current proof only works when $H$ arises from a regular Lagrangian,
not for general $H$.)


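PROOF 2 OF THEOREM $\mathrm{I}_2$. We have
$$\int_{\gamma_1 - \gamma_2} \sum_{i=1}^n p_i\,dq^i - H\,dt = \int_{\partial\mathcal T} \sum_{i=1}^n p_i\,dq^i - H\,dt = \int_{\mathcal T} d\Bigl(\sum_{i=1}^n p_i\,dq^i - H\,dt\Bigr),$$
so it suffices to show that the 2-form $\Lambda = d\bigl(\sum_{i=1}^n p_i\,dq^i - H\,dt\bigr)$ vanishes on $\mathcal T$.

It will be convenient to regard $T^*M \times \mathbb R$ as $\mathbb R^{2n+1}$. At any point $\mathfrak p$ of $\mathcal T$, consider
the $(2n+1)\times(2n+1)$ skew-symmetric matrix $A = (a_{ij})$ corresponding to $\Lambda(\mathfrak p)$,
as on pages 568-569. From the form of $\Lambda$, we see that the matrix $A$ is
$$A = \begin{pmatrix} J & DH^{\mathbf t} \\ -DH & 0 \end{pmatrix},$$
where the diagonal blocks have sizes $2n \times 2n$ and $1 \times 1$.

The Remark on page 569 shows that for tangent vectors $v$ and $w$ at $\mathfrak p$ we have
$$\Lambda(\mathfrak p)(w,v) = (\text{the single entry of})\ wAv^{\mathbf t}. \tag{$*$}$$

Now the vector
$$v = (H_{p_1}, \dots, H_{p_n}, -H_{q^1}, \dots, -H_{q^n}, 1)$$
is easily seen to be a tangent vector along the flow of the vector field $X$
corresponding to the Hamiltonian $H$, so at each $\mathfrak p \in \mathcal T$ it lies in the tangent space
of $\mathcal T$. But we easily compute that
$$Av^{\mathbf t} = 0,$$
so that by ($*$) we have
$$\Lambda(\mathfrak p)(w,v) = 0 \quad\text{for all } w \text{ at } \mathfrak p.$$

Hence $\Lambda(\mathfrak p)(w, v) = 0$ for linearly independent $w$ and $v$. On the 2-dimensional
tangent space $\mathcal T_{\mathfrak p}$, we therefore have $\Lambda(\mathfrak p) = 0$, for each $\mathfrak p$ in $\mathcal T$. ∎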
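
For readers who want to see the computation $Av^{\mathbf t} = 0$ concretely, here is a sketch for $n = 1$. It assumes the coordinate order $(q, p, t)$ and the convention $a_{ij} = \Lambda(e_i, e_j)$; that convention is chosen here for illustration and may differ by signs from the one on pages 568-569, though the kernel of the matrix is unaffected.

```latex
\begin{align*}
\Lambda &= d(p\,dq - H\,dt) = dp\wedge dq - H_q\,dq\wedge dt - H_p\,dp\wedge dt, \\
% assumed convention: a_{ij} = \Lambda(e_i, e_j), coordinates ordered (q, p, t)
A &= \begin{pmatrix} 0 & -1 & -H_q \\ 1 & 0 & -H_p \\ H_q & H_p & 0 \end{pmatrix},
\qquad v = (H_p,\; -H_q,\; 1), \\
Av^{\mathbf t} &= \begin{pmatrix} H_q - H_q \\ H_p - H_p \\ H_qH_p - H_pH_q \end{pmatrix} = 0 .
\end{align*}
```

The vector $v$ is just $(\dot q, \dot p, \dot t)$ for Hamilton's equations with $\dot t = 1$, which is why it spans the kernel.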


We have now shown in three different ways that on $T^*M$ the 1-form
$\theta = \sum_{i=1}^n p_i\,dq^i$ is a relative integral invariant. Hence, $\omega$ is an integral invariant: its
integral over any disc is the same as its integral over the image of that disc under
any $\phi_t$ of the flow generated by a Hamiltonian vector field. An approximation
argument could then be used to prove that $\omega$ itself must be preserved, so that

THEOREM. Each $\phi_t$ is a canonical transformation.

In short, the flow of any Hamiltonian vector field gives a 1-parameter family of
canonical transformations.

Thus, at the end of all these clever manipulations with integral invariants, a
bit more work would give a result that doesn't mention them at all. Of course,
it would certainly be much more convenient if we could go in the opposite
direction: Knowing that each $\phi_t$ is canonical would tell us immediately, without
approximation arguments, that $\omega$ is an integral invariant, and thus also that $\theta$
is a relative integral invariant.



With a wave of the wand, here comes the proof. 


Hamiltonian flows and canonical transformations. The claim that all $\phi_t$ for a
Hamiltonian vector field $X_H$ are canonical transformations amounts to saying
that the Lie derivative $L_{X_H}$ satisfies $L_{X_H}\omega = 0$. We will want to use the
formulas
$$d(L_X\lambda) = L_X(d\lambda),$$
$$L_X(\lambda\wedge\mu) = L_X\lambda\wedge\mu + \lambda\wedge L_X\mu,$$
and the Cartan formula, or as it is dubbed in Marsden and Ratiu [1],
$$L_X\lambda = d(X \lrcorner\, \lambda) + X \lrcorner\, d\lambda \qquad \text{Cartan's Magic Formula},$$
and also take this opportunity to introduce, for later use, the less familiar
$$[X, Y] \lrcorner\, \lambda = L_X(Y \lrcorner\, \lambda) - Y \lrcorner\, (L_X\lambda) \qquad \text{Cartan's Bracket Formula},$$
all of which are reviewed in Problem 1.


2. THEOREM. The flow of any Hamiltonian vector field consists of canonical 
transformations. 


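PROOF 1 (Hogwarts version). For $X_H$ satisfying $X_H \lrcorner\, \omega = -dH$ we have
$$\begin{aligned}
L_{X_H}\omega &= d(X_H \lrcorner\, \omega) + X_H \lrcorner\, d\omega\\
&= d(-dH) + X_H \lrcorner\, 0 = 0. \qquad ∎
\end{aligned}$$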


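PROOF 2 (Muggles version). We have
$$\begin{aligned}
L_{X_H}\Bigl(\sum_{i=1}^n dp_i\wedge dq^i\Bigr) &= \sum_{i=1}^n L_{X_H}dp_i\wedge dq^i + \sum_{i=1}^n dp_i\wedge L_{X_H}dq^i\\
&= \sum_{i=1}^n d(X_H(p_i))\wedge dq^i + dp_i\wedge d(X_H(q^i))\\
&= \sum_{i=1}^n d\Bigl(-\frac{\partial H}{\partial q^i}\Bigr)\wedge dq^i + \sum_{i=1}^n dp_i\wedge d\Bigl(\frac{\partial H}{\partial p_i}\Bigr).
\end{aligned}$$

The first sum is
$$\sum_{i,j=1}^n \Bigl(-\frac{\partial^2 H}{\partial q^j\,\partial q^i}\,dq^j\wedge dq^i - \frac{\partial^2 H}{\partial p_j\,\partial q^i}\,dp_j\wedge dq^i\Bigr) = 0 - \sum_{i,j=1}^n \frac{\partial^2 H}{\partial p_j\,\partial q^i}\,dp_j\wedge dq^i,$$
and we find similarly that the second sum is the negative of this. ∎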


Note: Theorem 2 immediately implies Liouville's volume theorem.
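
A brief sketch of why this implication holds, using only the product rule for the Lie derivative quoted above. The overall sign in the second line depends on how the coordinates are ordered and is left unspecified here.

```latex
\begin{align*}
L_{X_H}(\omega^n) &= \sum_{k=1}^{n} \omega^{k-1}\wedge (L_{X_H}\omega)\wedge \omega^{n-k} = 0, \\
\omega^n &= \pm\, n!\; dq^1\wedge\dots\wedge dq^n\wedge dp_1\wedge\dots\wedge dp_n, \\
\phi_t^*(\omega^n) &= (\phi_t^*\omega)^n = \omega^n .
\end{align*}
```

So each $\phi_t$ preserves a constant multiple of the standard volume form on $T^*M$, which is the content of the volume theorem.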




3. THEOREM. (Converse of Theorem 2). If $M$ is simply-connected (or more
generally, if $H^1(M;\mathbb R) = 0$), and all $\phi_t$ of the flow of $X$ are canonical, then $X$
is a Hamiltonian vector field.

PROOF. If all $\phi_t$ are canonical, then $L_X\omega = 0$, so
$$0 = L_X\omega = X \lrcorner\, d\omega + d(X \lrcorner\, \omega) = d(X \lrcorner\, \omega).$$

Thus $X \lrcorner\, \omega$ is closed, and the hypothesis on $M$ implies that it is therefore exact,
so that we have $X \lrcorner\, \omega = -dH$ for some function $H$. ∎


Without the extra hypothesis on $M$, we still have that $X$ is locally Hamiltonian
(i.e., Hamiltonian in a neighborhood of each point). For a simple example of a
locally Hamiltonian vector field $X$ that is not globally Hamiltonian, consider the
vector field on a torus with integral curves shown below.

[Figure: integral curves of a vector field on a torus.]

As we follow any of the integral curves, $H$ would have to increase, so it couldn't
be well-defined on the closed integral curve. Or, simply note that $H$ would have
to have a maximum point on the torus, and there $X$ would have to be 0; this
argument works even when the integral curves wind around the torus forever,
at an irrational slope.
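
One concrete instance of such a field can be written down explicitly. The coordinates and the choice of $X$ below are assumptions made for illustration (the torus is taken as $\mathbb R^2/\mathbb Z^2$ with $\omega = dp\wedge dq$); they are not necessarily the field drawn in the figure.

```latex
\begin{align*}
% assumed example: torus R^2/Z^2 with coordinates (q, p) mod 1
X &= \frac{\partial}{\partial q}, \qquad X \lrcorner\, \omega = (dp\wedge dq)\Bigl(\frac{\partial}{\partial q},\,\cdot\,\Bigr) = -dp, \\
d(X \lrcorner\, \omega) &= 0 \quad\Longrightarrow\quad L_X\omega = 0, \\
\oint_{\{q = \text{const}\}} X \lrcorner\, \omega &= -\oint dp = -1 \neq 0 .
\end{align*}
```

Locally $X \lrcorner\, \omega = -dH$ with $H = p$, but the nonzero period shows that $X \lrcorner\, \omega$ is not exact, so no single-valued $H$ exists on the whole torus. This agrees with the maximum-point argument, since this $X$ vanishes nowhere.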


Generating functions. To specify a canonical transformation $f\colon T^*M \to T^*M$,
we would seem to need $2n$ functions of $2n$ variables. On the other hand, the
fact that $f$ is canonical gives $2n$ relations between these functions, so we might
expect to need only one.

In fact, if we have a canonical $f\colon T^*M \to T^*M$ and a canonical coordinate
system $(q, p)$, then for the functions $Q = q\circ f$ and $P = p\circ f$ we have
$$\sum_{i=1}^n dp_i\wedge dq^i = \sum_{i=1}^n dP_i\wedge dQ^i \implies d\Bigl(\sum_{i=1}^n p_i\,dq^i - \sum_{i=1}^n P_i\,dQ^i\Bigr) = 0,$$
and hence locally there is a function $\mathcal S\colon T^*M \to \mathbb R$ with
$$\sum_{i=1}^n p_i\,dq^i - \sum_{i=1}^n P_i\,dQ^i = d\mathcal S. \tag{A}$$

Sometimes $\mathcal S$, which is only determined up to a constant, is called a generating
function of $f$, but the classical notion of a generating function treats $\mathcal S$ in a
particularly clever way, in order to disentangle the $(q, p)$ and $(Q, P)$ coordinates.
Our considerations are local, so we regard $f$ as a map $f\colon \mathbb R^n \times \mathbb R^n \to \mathbb R^n \times \mathbb R^n$
(shrinking down the region that $\mathbb R^n \times \mathbb R^n$ represents as needed for any additional
assumptions to hold).



Suppose that for our canonical transformation $f$ we have the following
condition on the Jacobian matrix,
$$0 \neq \det\frac{\partial Q(q,p)}{\partial p} = \det\frac{\partial(q, Q(q,p))}{\partial(q, p)}, \tag{$J_1$}$$
so that knowing $q$ for a point in $T^*M$ and $Q(q, p)$ for its image under $f$
determines $p$ uniquely [as trivial examples, in dimension 1 this is true for
$f(q, p) = (q + p, p)$, but not for the identity function $f(q, p) = (q, p)$]. This
condition implies that the map
$$\chi\colon \mathbb R^n \times \mathbb R^n \to \mathbb R^n \times \mathbb R^n \quad\text{defined by}\quad \chi(q, p) = (q, Q(q, p))$$
is a diffeomorphism, and thus $(q, Q)$ is a coordinate system on $\mathbb R^n \times \mathbb R^n$ (no one
is saying that it is canonical).

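When ($J_1$) holds, so that $(q, Q)$ is a coordinate system, equation (A) on
page 577 can simply be summed up by the equations
$$\frac{\partial\mathcal S}{\partial q^i} = p_i, \qquad \frac{\partial\mathcal S}{\partial Q^i} = -P_i \qquad \text{in the } (q,Q) \text{ coordinate system on } \mathbb R^n \times \mathbb R^n. \tag{1}$$
In order to relate these equations to the standard $(q, p)$ coordinate system, we
define the “type 1 generating function” $S_1\colon \mathbb R^n \times \mathbb R^n \to \mathbb R$ by
$$S_1(q, Q(q,p)) = \mathcal S(q, p), \quad\text{or, more precisely,}\quad S_1 = \mathcal S\circ\chi^{-1}. \tag{$F_1$}$$
Then the formal definition of $\partial/\partial q^i$ and $\partial/\partial Q^i$ translates equations (1) into
$$D_iS_1(q, Q(q, p)) = p_i, \qquad D_{n+i}S_1(q, Q(q, p)) = -P_i,$$
where $D_j$ denotes the partial derivative with respect to the $j$th variable. This
can also be written in conventional notation as
$$\frac{\partial S_1}{\partial q^i}(q, Q(q,p)) = p_i \quad\text{or briefly}\quad \frac{\partial S_1}{\partial q^i}(q, Q) = p_i \tag{$F_1$-a}$$
$$\frac{\partial S_1}{\partial Q^i}(q, Q(q,p)) = -P_i \quad\text{or briefly}\quad \frac{\partial S_1}{\partial Q^i}(q, Q) = -P_i \tag{$F_1$-b}$$
where $\partial/\partial q^i$ and $\partial/\partial Q^i$ are now just being used as convenient abbreviations
for $D_i$ and $D_{n+i}$. It will be clear when they are being used this way
because the argument will be $(q, Q(q, p))$, or more elliptically $(q, Q)$.
In some contexts we might simply denote $\partial/\partial Q^i$ by $\partial/\partial p_i$, as when we observe
that ($F_1$-a) implies
$$\delta_i^j = \frac{\partial p_i}{\partial p_j} = \frac{\partial}{\partial p_j}\Bigl(\frac{\partial S_1}{\partial q^i}(q, Q(q,p))\Bigr) = \sum_{k=1}^n \frac{\partial^2 S_1}{\partial q^i\,\partial p_k}(q, Q(q,p))\cdot\frac{\partial Q^k}{\partial p_j}(q,p), \tag{2}$$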


so $(\partial^2S_1/\partial q\,\partial p)(q, Q(q, p))$ and $(\partial Q/\partial p)(q, p)$ are inverses of each other, and
in particular $(\partial^2S_1/\partial q\,\partial p)$ is non-singular at the points of interest to us.

The name “generating function” refers to the fact that we can go in the other
direction, and use a function $S\colon \mathbb R^n \times \mathbb R^n \to \mathbb R$ satisfying
$$\det\frac{\partial^2 S}{\partial q\,\partial p} \neq 0$$
to define, or generate, a canonical transformation. In fact, this condition
implies, by the implicit function theorem, that there are functions $Q^i\colon \mathbb R^n \times \mathbb R^n \to \mathbb R$
satisfying the equivalent of ($F_1$-a), namely
$$\frac{\partial S}{\partial q^i}(q, Q(q, p)) = p_i. \tag{3}$$
The calculation (2), now applied to (3) instead of ($F_1$-a), shows that we have
$$\det\frac{\partial Q(q, p)}{\partial p} \neq 0. \tag{4}$$
This means, once again, that we can also use $(q, Q)$ as a coordinate system,
and then (3) can be written as
$$\frac{\partial S}{\partial q^i} = p_i \quad \text{in the } (q, Q) \text{ coordinate system.} \tag{a}$$
We now define $P_i$ by the analogue of ($F_1$-b),
$$P_i = -\frac{\partial S}{\partial Q^i}(q, Q(q, p)),$$
so that we also have
$$\frac{\partial S}{\partial Q^i} = -P_i \quad \text{in the } (q, Q) \text{ coordinate system.} \tag{b}$$
Using (a) and (b) we now compute, in the $(q, Q)$ coordinate system, that
$$\sum_{i=1}^n p_i\,dq^i - P_i\,dQ^i = \sum_{i=1}^n \Bigl(\frac{\partial S}{\partial q^i}\,dq^i + \frac{\partial S}{\partial Q^i}\,dQ^i\Bigr) = dS,$$
and taking $d$ of this equation shows that the function $f = (Q, P)$ is canonical.

In addition, the argument on the previous page showed that (4) = ($J_1$) implies
that $f$ has a generating function $S_1$, satisfying equations ($F_1$-a) and ($F_1$-b), which
are the same as (a) and (b) for $S$, so that $S = S_1 +$ constant, and we can conclude
that our given $S$ is in fact a type 1 generating function for the transformation
$f = (Q, P)$ that we have just defined.
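
As a small illustration of this recipe, consider the function $S(q, Q) = \sum_i q^iQ^i$. This choice is made here for illustration; it is probably the simplest function meeting the determinant condition.

```latex
\begin{align*}
S(q,Q) &= \sum_{i=1}^n q^iQ^i, \qquad \det\frac{\partial^2 S}{\partial q^i\,\partial Q^j} = \det(\delta_{ij}) = 1 \neq 0, \\
p_i &= \frac{\partial S}{\partial q^i} = Q^i, \qquad P_i = -\frac{\partial S}{\partial Q^i} = -q^i, \\
dS &= \sum_{i=1}^n \bigl(Q^i\,dq^i + q^i\,dQ^i\bigr) = \sum_{i=1}^n \bigl(p_i\,dq^i - P_i\,dQ^i\bigr).
\end{align*}
```

The transformation generated is $Q^i = p_i$, $P_i = -q^i$, the coordinate-swapping map with a sign change.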




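Though condition ($J_1$) can easily fail to be true, we might have instead
$$0 \neq \det\frac{\partial P(q, p)}{\partial p} = \det\frac{\partial(q, P(q, p))}{\partial(q, p)} \tag{$J_2$}$$
(which does hold for the identity map). In this case, $q$ and $P(q, p)$ determine $p$,
so $(q, P)$ is a coordinate system, and we have the diffeomorphism
$$\psi(q, p) = (q, P(q, p)).$$

In order to find the partial derivatives $\partial\mathcal S/\partial q^i$ and $\partial\mathcal S/\partial P_i$, we use the relation
$d(P_iQ^i) = P_i\,dQ^i + Q^i\,dP_i$ to write equation (A) on page 577 as
$$d\Bigl(\mathcal S + \sum_{i=1}^n P_iQ^i\Bigr) = \sum_{i=1}^n p_i\,dq^i + \sum_{i=1}^n Q^i\,dP_i.$$
If we write the left side as $d\bar{\mathcal S}$, then we have
$$\frac{\partial\bar{\mathcal S}}{\partial q^i} = p_i, \qquad \frac{\partial\bar{\mathcal S}}{\partial P_i} = Q^i,$$
and defining the “type 2 generating function” $S_2$ by
$$S_2(q, P(q, p)) = \mathcal S(q, p) + \sum_{i=1}^n P_iQ^i(q, p), \quad\text{that is,}\quad S_2 = \bar{\mathcal S}\circ\psi^{-1}, \tag{$F_2$}$$
we end up with
$$\frac{\partial S_2}{\partial q^i}(q, P(q, p)) = p_i \tag{$F_2$-a}$$
$$\frac{\partial S_2}{\partial P_i}(q, P(q, p)) = Q^i. \tag{$F_2$-b}$$

From ($F_2$-a) we also obtain the analogue of (2) on page 578,
$$\delta_i^j = \frac{\partial}{\partial p_j}\Bigl(\frac{\partial S_2}{\partial q^i}(q, P(q,p))\Bigr) = \sum_{k=1}^n \frac{\partial^2 S_2}{\partial q^i\,\partial P_k}(q, P(q, p))\cdot\frac{\partial P_k}{\partial p_j}(q, p). \tag{$2'$}$$

Just as with type 1 generating functions, the condition $\det(\partial^2S/\partial q\,\partial p) \neq 0$
can also be used to generate a canonical transformation by means of a type 2
generating function: First, we choose functions $P_i\colon \mathbb R^n \times \mathbb R^n \to \mathbb R$ satisfying the
equivalent of ($F_2$-a). We next note that applying the calculation ($2'$) to $S$ shows
that $(\partial P/\partial p)$ is nonsingular, so that we can use $(q, P)$ as a coordinate system.
And we then define the $Q^i$ by means of equation ($F_2$-b). We leave it to the reader
to check that we now have $\sum_{i=1}^n p_i\,dq^i - P_i\,dQ^i = d\bigl(S - \sum_{i=1}^n P_iQ^i(q, p)\bigr)$.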
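
To see the type 2 recipe in action, here is a sketch in dimension 1 that recovers the "parabolic arc" transformation $Q = q + pt + \tfrac12 gt^2$, $P = p + gt$ for fixed constants $g$ and $t$. The particular function $S$ below was found by integrating ($F_2$-a) and ($F_2$-b) and is offered only as one possible choice, determined up to an additive constant.

```latex
\begin{align*}
% assumed generating function, with g and t treated as fixed constants
S(q,P) &= qP - gt\,q + \tfrac12 t\,P^2 - \tfrac12 gt^2 P, \qquad \frac{\partial^2 S}{\partial q\,\partial P} = 1 \neq 0, \\
p &= \frac{\partial S}{\partial q} = P - gt \quad\Longrightarrow\quad P = p + gt, \\
Q &= \frac{\partial S}{\partial P} = q + tP - \tfrac12 gt^2 = q + pt + \tfrac12 gt^2 .
\end{align*}
```

Setting $g = 0$ and $t = 0$ gives $S = qP$, which generates the identity map, the case that type 1 generating functions cannot handle.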


Other types of generating functions need not be considered until later. 




Time-dependent canonical transformations. Before examining the use of
generating functions, we want to extend our results not only to Hamiltonians
$H\colon T^*M \times \mathbb R \to \mathbb R$ that depend on time, but also to the more general
situation where the canonical transformations themselves may depend on time.
That is, we want to consider maps $g\colon T^*M \times \mathbb R \to T^*M \times \mathbb R$ of the form
$$g(\mathfrak p, t) = (f(\mathfrak p, t), t), \quad\text{for } \mathfrak p \in T^*M,$$
with each $\mathfrak p \mapsto f(\mathfrak p, t)$ being canonical (e.g., the 1-parameter family of canonical
transformations generated by a Hamiltonian vector field). The big difference
in this more general case is that the new Hamiltonian will not simply be $H\circ g^{-1}$.

Among the different possible approaches to this question, we choose one that
requires no additional abstractions, though this will not relieve us from having
to contend with somewhat fussy notation. Given any $f\colon T^*M \times \mathbb R \to T^*M$,
for each $t$ we will let ${}^t\!f\colon T^*M \to T^*M$ be the function
$${}^t\!f(\mathfrak p) = f(\mathfrak p, t),$$
so that we have a 1-parameter collection of maps from $T^*M$ to $T^*M$ and saying
that $\mathfrak p \mapsto f(\mathfrak p, t)$ is canonical is equivalent to saying that all ${}^t\!f$ are canonical.
Locally, $g$ will now be given by a collection of $2n + 1$ coordinate functions
$$Q^1(\mathfrak p, t), \dots, Q^n(\mathfrak p, t),\ P_1(\mathfrak p, t), \dots, P_n(\mathfrak p, t), \text{ and } t,$$
and for each fixed $t$ we have corresponding functions ${}^tQ^1, \dots, {}^tQ^n, {}^tP_1, \dots, {}^tP_n$,
which are the coordinate functions for ${}^t\!f$.
[To go along with this notation, we will allow $q^i$, $p_i$ on $T^*M$ to be confused
with coordinate systems with the same names on $T^*M \times \mathbb R$, so that we can write
$q^i(\mathfrak p, t)$ as well as $q^i(\mathfrak p)$ and $p_i(\mathfrak p, t)$ as well as $p_i(\mathfrak p)$;
then for all $t$, the functions ${}^tq^i$, ${}^tp_i$ are the same coordinates $q^i$, $p_i$ on $T^*M$.]
The fact that all ${}^t\!f$ are canonical means that for each $t$
$$\sum_{i=1}^n d({}^tp_i)\wedge d({}^tq^i) = \sum_{i=1}^n d({}^tP_i)\wedge d({}^tQ^i)$$
so that
$$d\Bigl(\sum_{i=1}^n {}^tp_i\,d({}^tq^i) - \sum_{i=1}^n {}^tP_i\,d({}^tQ^i)\Bigr) = 0,$$
and thus locally
$$\sum_{i=1}^n {}^tp_i\,d({}^tq^i) - \sum_{i=1}^n {}^tP_i\,d({}^tQ^i) = ds_t, \tag{1}$$
for some function $s_t\colon T^*M \to \mathbb R$. We can put all the $s_t$ together into a function
$$\mathcal S\colon T^*M \times \mathbb R \to \mathbb R \quad\text{defined by}\quad \mathcal S(\mathfrak p, t) = s_t(\mathfrak p)$$
(so that, officially, $s_t = {}^t\mathcal S$).



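After all this notational kanoodling, we next want to note that if $D_{2n+1}$ denotes
the partial derivative with respect to the $(2n + 1)$st variable, then we can write
$$dQ^i = d({}^tQ^i) + D_{2n+1}(Q^i)\,dt,$$
where this equation means that when we write $dQ^i(\mathfrak p, t)$ in terms of $dq^i(\mathfrak p, t)$,
$dp_i(\mathfrak p, t)$, and $dt(\mathfrak p, t)$, the coefficients in the $dq^i$, $dp_i$ part come from $d({}^tQ^i)$,
while the coefficient of $dt$ is $D_{2n+1}(Q^i)(\mathfrak p, t)$. By abuse of notation we will
simply write
$$dQ^i = d({}^tQ^i) + \frac{\partial Q^i}{\partial t}\,dt.$$

It follows that
$$\sum_{i=1}^n P_i\,dQ^i = \sum_{i=1}^n P_i\,d({}^tQ^i) + \sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t}\,dt. \tag{2}$$
We likewise have
$$d\mathcal S = d({}^t\mathcal S) + \frac{\partial\mathcal S}{\partial t}\,dt = ds_t + \frac{\partial\mathcal S}{\partial t}\,dt,$$
so (1) can be written as
$$\sum_{i=1}^n p_i\,dq^i - \sum_{i=1}^n P_i\,d({}^tQ^i) = d\mathcal S - \frac{\partial\mathcal S}{\partial t}\,dt,$$
where we keep the ${}^tQ^i$ to remind us that $d({}^tQ^i)$ is computed separately for each
fixed $t$. Substituting (2) into this equation we then obtain, at long last,
$$\sum_{i=1}^n p_i\,dq^i - \sum_{i=1}^n P_i\,dQ^i = d\mathcal S - \Bigl[\frac{\partial\mathcal S}{\partial t} + \sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t}\Bigr]dt. \tag{A}$$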
i=] i=1 





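4. THEOREM. Let $g\colon T^*M \times \mathbb R \to T^*M \times \mathbb R$ be a canonical transformation,
and suppose that the curve $c\colon \mathbb R \to T^*M \times \mathbb R$ satisfies the canonical equations
for a Hamiltonian $H\colon T^*M \times \mathbb R \to \mathbb R$. Then the curve $g\circ c$ satisfies the
canonical equations for the Hamiltonian $K\colon T^*M \times \mathbb R \to \mathbb R$ given by
$$K\circ g = H + \Bigl[\sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t} + \frac{\partial\mathcal S}{\partial t}\Bigr]. \tag{B}$$
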


PROOF. The “right” proof (or perhaps we should say the canonical proof),
would be to generalize Theorem 1 appropriately to $T^*M \times \mathbb R$. But empathizing
with physicists, we will resort to a much easier elementary approach.

Setting $\theta = \sum_{i=1}^n p_i\,dq^i$, we find that if $K$ is defined by (B), then equation (A)
can be written as
$$\theta - g^*\theta = d\mathcal S - (K\circ g - H)\,dt$$
and thus as
$$\theta - H\,dt = g^*(\theta - K\,dt) + d\mathcal S.$$
For a curve $c\colon [a,b] \to T^*M \times \mathbb R$, this shows that
$$\int_c \theta - H\,dt \;-\; \int_{g\circ c} \theta - K\,dt = \text{the constant } \mathcal S(c(b), b) - \mathcal S(c(a), a).$$

So the stationary curves for $\theta - K\,dt$ are the images under $g$ of the stationary
curves for $\theta - H\,dt$. But the extended Hamilton's principle (page 534) says that
the stationary curves for $\theta - H\,dt$ are the curves satisfying Hamilton's equations
for $H$, while those for $\theta - K\,dt$ are similarly the curves satisfying Hamilton's
equations for $K$. ∎
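
A simple time-dependent example may help fix the meaning of formula (B). The sketch below uses a uniformly moving frame in dimension 1, $Q = q - vt$, $P = p$, with a constant $v$ and a free-particle Hamiltonian $H = \tfrac12 p^2$; both choices are assumptions made for illustration and do not come from the text.

```latex
\begin{align*}
% assumed example: g(q,p,t) = (q - vt,\; p,\; t), with v constant
p\,dq - P\,dQ &= p\,dq - p\,(dq - v\,dt) = pv\,dt
   \quad\Longrightarrow\quad \mathcal S = 0 \text{ in (A)}, \\
K\circ g &= H + P\,\frac{\partial Q}{\partial t} + \frac{\partial \mathcal S}{\partial t} = H - vp, \\
% assumed Hamiltonian H = p^2/2:
K(Q,P,t) &= \tfrac12 P^2 - vP, \qquad \dot Q = \frac{\partial K}{\partial P} = P - v, \qquad \dot P = -\frac{\partial K}{\partial Q} = 0 .
\end{align*}
```

This agrees with the direct computation $\dot Q = \dot q - v = p - v$, whereas the naive guess $K = H\circ g^{-1}$ would give $\dot Q = P$.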


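Our discussion of generating functions is easily adapted to the generalized
case. Again working locally, we consider $g\colon \mathbb R^n \times \mathbb R^n \times \mathbb R \to \mathbb R^n \times \mathbb R^n \times \mathbb R$. We
now suppose that $g$ has the property that knowing $q$, $Q$, and $t$ determines $p$;
the condition for this will be
$$0 \neq \det\frac{\partial Q(q, p, t)}{\partial p} = \det\frac{\partial(q, Q(q, p, t), t)}{\partial(q, p, t)}. \tag{$J'_1$}$$
We define the “type 1 generating function” $S_1\colon \mathbb R^n \times \mathbb R^n \times \mathbb R \to \mathbb R$ by
$$S_1(q, Q(q, p, t), t) = \mathcal S(q, p, t), \tag{$F'_1$}$$
and find that
$$\frac{\partial S_1}{\partial q^i}(q, Q(q, p, t), t) = p_i \tag{$F'_1$-a}$$
$$\frac{\partial S_1}{\partial Q^i}(q, Q(q, p, t), t) = -P_i. \tag{$F'_1$-b}$$
Applying ($F'_1$-b) to equation (B) on the previous page gives
$$K\circ g = H - \sum_{i=1}^n \frac{\partial S_1}{\partial Q^i}\frac{\partial Q^i}{\partial t} + \frac{\partial\mathcal S}{\partial t}.$$
But taking $\partial/\partial t$ of ($F'_1$) gives
$$\frac{\partial\mathcal S}{\partial t} = \frac{\partial S_1}{\partial t} + \sum_{i=1}^n \frac{\partial S_1}{\partial Q^i}\frac{\partial Q^i}{\partial t},$$
and we thus obtain
$$K\circ g = H + \frac{\partial S_1}{\partial t}. \tag{$F'_1$-c}$$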


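If instead of ($J'_1$) we have
$$0 \neq \det\frac{\partial P(q, p, t)}{\partial p} = \det\frac{\partial(q, P(q, p, t), t)}{\partial(q, p, t)} \tag{$J'_2$}$$
and we define the “type 2 generating function” $S_2$ by
$$S_2(q, P(q, p, t), t) = \mathcal S(q, p, t) + \sum_{i=1}^n P_iQ^i(q, p, t), \tag{$F'_2$}$$
then as before we obtain
$$\frac{\partial S_2}{\partial q^i}(q, P(q, p, t), t) = p_i \tag{$F'_2$-a}$$
$$\frac{\partial S_2}{\partial P_i}(q, P(q, p, t), t) = Q^i. \tag{$F'_2$-b}$$

Moreover, taking $\partial/\partial t$ of ($F'_2$) gives
$$\frac{\partial S_2}{\partial t} + \sum_{i=1}^n \frac{\partial S_2}{\partial P_i}\frac{\partial P_i}{\partial t} = \frac{\partial\mathcal S}{\partial t} + \sum_{i=1}^n \Bigl(\frac{\partial P_i}{\partial t}Q^i + P_i\frac{\partial Q^i}{\partial t}\Bigr),$$
so
$$\begin{aligned}
K\circ g &= H + \sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t} + \frac{\partial\mathcal S}{\partial t}\\
&= H + \sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t} + \frac{\partial S_2}{\partial t} + \sum_{i=1}^n \frac{\partial S_2}{\partial P_i}\frac{\partial P_i}{\partial t} - \sum_{i=1}^n \frac{\partial P_i}{\partial t}Q^i - \sum_{i=1}^n P_i\frac{\partial Q^i}{\partial t}\\
&= H + \frac{\partial S_2}{\partial t} + \sum_{i=1}^n \Bigl(\frac{\partial S_2}{\partial P_i} - Q^i\Bigr)\frac{\partial P_i}{\partial t},
\end{aligned}$$
and ($F'_2$-b) then gives
$$K\circ g = H + \frac{\partial S_2}{\partial t}. \tag{$F'_2$-c}$$

In both cases, the reverse direction discussed on page 579 also generalizes in a
straightforward way.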
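
As a quick illustration of the type 2 formulas with time dependence, take the hypothetical generating function $S_2(q, P, t) = (q - vt)P$ in dimension 1, with $v$ a constant. This example is chosen here for illustration and is not taken from the text.

```latex
\begin{align*}
% assumed generating function: S_2(q,P,t) = (q - vt) P
p &= \frac{\partial S_2}{\partial q} = P, \qquad
Q = \frac{\partial S_2}{\partial P} = q - vt, \qquad
\frac{\partial^2 S_2}{\partial q\,\partial P} = 1 \neq 0, \\
K\circ g &= H + \frac{\partial S_2}{\partial t} = H - vP .
\end{align*}
```

So the passage to a uniformly moving frame, $Q = q - vt$, $P = p$, shifts the Hamiltonian by $-vP$, and the shift comes entirely from the explicit time dependence of $S_2$.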




Using generating functions to simplify Hamilton's equations. After all this to-do,
we recall that we first considered canonical transformations as a way of
transforming Hamilton's equations for the Hamiltonian $H$ into a new set having a
Hamiltonian $K$ that is easier to solve. Generating functions promise to make
this simpler, because the use of generating functions reduces the problem to one
involving a single function of $2n$ variables (you can probably see where this is
heading). The ideal situation would be to have $K = 0$, since the canonical
equations for the $Q^i$ and $P_i$ would now give $\dot Q^i = 0 = \dot P_i$, with solutions
$P_i = a_i$, $Q^i = b_i$ for some constants $a_i$ and $b_i$.

Consider first a type 2 generating function, which we will simply call $S$, with
$$\frac{\partial S(q, P, t)}{\partial q^i} = p_i \tag{$F'_2$-a}$$
$$\frac{\partial S(q, P, t)}{\partial P_i} = Q^i \tag{$F'_2$-b}$$
$$K\circ g = H + \frac{\partial S(q, P, t)}{\partial t}. \tag{$F'_2$-c}$$

In order to have $K = 0$, the generating function $S$ must satisfy
$$\frac{\partial S(q, P, t)}{\partial t} + H(q^1, \dots, q^n, p_1, \dots, p_n, t) = 0;$$
substituting from ($F'_2$-a), we obtain (ah-hah!) the Hamilton–Jacobi equation,
$$\frac{\partial S(q, P, t)}{\partial t} + H\Bigl(q^1, \dots, q^n, \frac{\partial S(q, P, t)}{\partial q^1}, \dots, \frac{\partial S(q, P, t)}{\partial q^n}, t\Bigr) = 0.$$

More precisely, what we have here is the Hamilton–Jacobi equation with $n$
parameters, $P_1, \dots, P_n$, which we want to be the constants $a_1, \dots, a_n$. This
means that solving this equation is the same thing as finding a complete integral
$\varphi(q^1, \dots, q^n, \alpha_1, \dots, \alpha_n)$ of the standard Hamilton–Jacobi equation. There is,
in fact, at least one complete integral, namely $S(q, P, t)$, since ($J'_2$) shows that
$\det(\partial^2\varphi/\partial q^j\,\partial\alpha_k) \neq 0$, though of course that doesn't necessarily mean that we'll
be able to find one by separation of variables.

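If we do succeed in finding a complete integral $\varphi$, then the desired equations
$Q^i = b_i$ become, by ($F'_2$-b),
$$\frac{\partial\varphi}{\partial\alpha_i}(q^1, \dots, q^n, \alpha_1, \dots, \alpha_n, t) = b_i$$
and ($F'_2$-a) then gives
$$p_i = \frac{\partial\varphi}{\partial q^i}(q^1, \dots, q^n, \alpha_1, \dots, \alpha_n, t).$$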


Thus, we end up with exactly the same set of equations as those in the statement
of Jacobi's theorem, so that in this context, canonical transformations merely
provide an additional route to Jacobi's theorem (and a more complicated
explanation of how to use the Hamilton–Jacobi equation).

The real usefulness of canonical transformations for theoretical work will
appear in the following chapters. For the moment, we simply emphasize that a
type 2 generating function gives a canonical transformation $(q, p) \mapsto (Q, P)$
for which the equations $Q^i = b_i$ give solutions to Hamilton's equations.
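
The simplest case in which this program can be carried out by hand is the free particle in dimension 1. The sketch below assumes $H = \tfrac12 p^2$ (unit mass), a choice made purely for illustration.

```latex
\begin{align*}
% assumed Hamiltonian H = p^2/2; candidate complete integral:
S(q,P,t) &= qP - \tfrac12 P^2 t, \qquad \frac{\partial^2 S}{\partial q\,\partial P} = 1 \neq 0, \\
\frac{\partial S}{\partial t} + \frac12\Bigl(\frac{\partial S}{\partial q}\Bigr)^2 &= -\tfrac12 P^2 + \tfrac12 P^2 = 0, \\
Q &= \frac{\partial S}{\partial P} = q - Pt = b, \qquad p = \frac{\partial S}{\partial q} = P = a, \\
q(t) &= b + at, \qquad p(t) = a .
\end{align*}
```

Here $K = 0$, the new coordinates $(Q, P) = (b, a)$ are the initial position and the momentum, and inverting $Q = b$ returns the expected straight-line motion.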


For a type 1 generating function, where we have
$$\frac{\partial S(q, Q, t)}{\partial q^i} = p_i \tag{$F'_1$-a}$$
$$\frac{\partial S(q, Q, t)}{\partial Q^i} = -P_i \tag{$F'_1$-b}$$
$$K\circ g = H + \frac{\partial S(q, Q, t)}{\partial t} \tag{$F'_1$-c}$$
things just get shuffled around a bit. We still end up with the Hamilton–Jacobi
equation,
$$\frac{\partial S(q, Q, t)}{\partial t} + H\Bigl(q^1, \dots, q^n, \frac{\partial S(q, Q, t)}{\partial q^1}, \dots, \frac{\partial S(q, Q, t)}{\partial q^n}, t\Bigr) = 0,$$
but now the $Q^i$ are the parameters, which we want to be the constants $b_1, \dots, b_n$,
and we consider a complete integral $\varphi(q^1, \dots, q^n, b_1, \dots, b_n)$ with ($J'_1$) showing
that $S$ is in fact a complete integral.

If we succeed in finding a complete integral explicitly, the desired equations
$P_i = a_i$ then become, by ($F'_1$-b),
$$-\frac{\partial\varphi}{\partial b_i}(q^1, \dots, q^n, b_1, \dots, b_n, t) = a_i$$
and ($F'_1$-a) then gives
$$p_i = \frac{\partial\varphi}{\partial q^i}(q^1, \dots, q^n, b_1, \dots, b_n, t).$$

Analogous to the situation for a type 2 generating function, if there is a type 1
generating function, then there is a canonical transformation $(q, p) \mapsto (Q, P)$
for which the equations $P_i = a_i$ give solutions to Hamilton's equations.

Generating functions in the time-independent case. In Chapter 21 an extremely
important role will be taken by a modification of this approach for the case of
a time-independent Hamiltonian $H\colon T^*M \to \mathbb R$.

Instead of obtaining a Hamilton–Jacobi equation in which $t$ does not
explicitly appear, and then writing $S(q, \alpha, t) = W(q) - \alpha t$, $H(q, \partial W/\partial q) = \alpha$,
leading us to solve the Hamilton–Jacobi equation for the characteristic
function $W(q, \alpha, \alpha_2, \dots, \alpha_n)$, a more direct approach is simply to look for $W$ right
from the start. For example, the equations ($F_2$-a,b) for a type 2 generating
function $S$ not depending on $t$, which we will now denote by $W$, are
$$\frac{\partial W(q, P)}{\partial q^i} = p_i, \qquad \frac{\partial W(q, P)}{\partial P_i} = Q^i,$$
which are simply the equations ($F'_2$-a,b) from before, while ($F'_2$-c) gives us the
equation $K\circ g = H + 0 = \alpha$.

We obviously can't make $K = 0$ now. Instead, when $\partial W(q, P)/\partial q^i$ is
substituted for $p_i$ in $H = \alpha$, we get the Hamilton–Jacobi equation for Hamilton's
characteristic function $W(q, P)$,
$$H\Bigl(q, \frac{\partial W(q, P)}{\partial q}\Bigr) = \alpha,$$
which we solve as $W(q, \alpha, \alpha_2, \dots, \alpha_n) = \cdots$, and then $K\circ g = H$ means that
when we write $K$ in terms of the coordinates $(Q, P)$, as $K(Q, \alpha, \alpha_2, \dots, \alpha_n)$, we
simply have $K(Q, \alpha, \alpha_2, \dots, \alpha_n) = \alpha$. So in this coordinate system Hamilton's
equations become (in condensed form)
$$\dot P_i = -\frac{\partial K}{\partial Q^i} = 0, \qquad \dot Q^1 = \frac{\partial K}{\partial\alpha} = 1, \qquad \dot Q^i = \frac{\partial K}{\partial\alpha_i} = 0 \quad\text{for } i = 2, \dots, n,$$
with the solutions $P_i = \alpha_i$, and $Q^1 = t + b^1$ but $Q^i = b^i$ for $i = 2, \dots, n$.


[As a fairly trivial example, by comparison with the treatment of the harmonic
oscillator on pages 550-551, we now solve the Hamilton–Jacobi equation for $W$,
obtaining, as before, $W(q, \alpha) = \omega\int\sqrt{\dfrac{2\alpha}{\omega^2} - q^2}\;dq$, and we then get
$$t + b = Q^1 = \frac{\partial W}{\partial\alpha} = \frac{1}{\omega}\int\frac{dq}{\sqrt{\dfrac{2\alpha}{\omega^2} - q^2}},$$
giving the same result.]

More generally, suppose we simply have a generating function $W(q, P)$ with
$$H\Bigl(q, \frac{\partial W(q, P)}{\partial q}\Bigr) = K(P) \tag{$*$}$$
for some function $K$ that does not depend on $Q$ (Problem 3 gives a very simple
special example where this happens, but the real use for this case occurs in
Chapter 21). Then Hamilton's equations for the new Hamiltonian $K$ will simply
become
$$\dot P_i = -\frac{\partial K}{\partial Q^i} = 0 \implies P_i = \alpha_i \text{ for constants } \alpha_i, \qquad \dot Q^i = \frac{\partial K}{\partial P_i} = \frac{\partial K}{\partial\alpha_i},$$
which can be solved, in condensed form, as
$$\begin{aligned}
Q^i(t) &= Q^i(0) + t\,\frac{\partial K}{\partial\alpha_i}(\alpha_1, \dots, \alpha_n)\\
&= Q^i(0) + t\,\nu_i(\alpha_1, \dots, \alpha_n), \text{ say.}
\end{aligned}$$

In other words, in the coordinate system $(Q, P)$, every solution $\gamma$ of Hamilton's
equations will have constant $P_i(\gamma(t)) = \alpha_i$, while the $Q^i(\gamma(t))$ will all be linear
in $t$, with the coefficients of $t$ determined by those $\alpha_i$.
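
The harmonic oscillator gives a one-line illustration of this "linear in $t$" behavior. The sketch assumes the normalization $H = \tfrac12(p^2 + \omega^2q^2)$ and uses the canonical transformation $q = \sqrt{2P/\omega}\,\sin Q$, $p = \sqrt{2\omega P}\,\cos Q$; whether this matches the conventions of Problem 3 exactly is not checked here.

```latex
\begin{align*}
% assumed: H = (p^2 + \omega^2 q^2)/2, so that in the new coordinates
K(P) &= \omega P, \qquad \dot P = -\frac{\partial K}{\partial Q} = 0, \qquad \dot Q = \frac{\partial K}{\partial P} = \omega, \\
P(t) &= a, \qquad Q(t) = Q(0) + \omega t, \\
q(t) &= \sqrt{\frac{2a}{\omega}}\,\sin\bigl(\omega t + Q(0)\bigr), \qquad
p(t) = \sqrt{2\omega a}\,\cos\bigl(\omega t + Q(0)\bigr).
\end{align*}
```

One checks directly that $\dot q = p$ and $\dot p = -\omega^2 q$, so transforming back does recover the usual oscillator solutions.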


Naturally, all these considerations hold, mutatis mutandis, for type 1 generating 
functions. 


Despite the apparent efficiency of this approach, and its importance later on, 
we will often find it conceptually simpler to work directly with the formulas for 
time-dependent Hamiltonians first, and then, when necessary, specialize to the 
case of the time-independent Hamiltonian. 


Other types of generating functions. The need to consider both type 1 and
type 2 generating functions is easily seen from the 1-dimensional example of
the harmonic oscillator, with $q = \cos t$, $p = -\sin t$; although these equations
simply parameterize the unit circle, $q$ can't be used as a coordinate system in a
neighborhood of $(1,0)$ or $(-1,0)$, while $p$ can't be used in a neighborhood of
$(0,1)$ or $(0,-1)$.

Problem 4 gives the standard ways to define a type 3 generating function of
the form $S(p, Q)$ when we have the condition
$$\det\frac{\partial Q(q, p)}{\partial q} \neq 0, \tag{$J_3$}$$
and a type 4 generating function of the form $S(p, P)$, when we have
$$\det\frac{\partial P(q, p)}{\partial q} \neq 0, \tag{$J_4$}$$
but it is easily seen that none of the conditions ($J_1$)–($J_4$) is satisfied for the
canonical transformation in dimension 2 given by
$$\begin{aligned}
Q^1 &= p_1, & P_1 &= -q^1\\
Q^2 &= q^2, & P_2 &= p_2.
\end{aligned}$$

Another useful device is to look for a generating function for the inverse of
a given canonical transformation, which will be used in Chapter 22, but that
won't help here, either.
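
The claim that none of the four conditions holds for the transformation $Q^1 = p_1$, $P_1 = -q^1$, $Q^2 = q^2$, $P_2 = p_2$ can be checked by writing out the four $2\times 2$ Jacobian blocks:

```latex
\begin{align*}
\frac{\partial Q}{\partial p} &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, &
\frac{\partial P}{\partial p} &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, &
\frac{\partial Q}{\partial q} &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, &
\frac{\partial P}{\partial q} &= \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}.
\end{align*}
```

Each block has determinant 0, because the transformation behaves like a swap in the first pair of coordinates and like the identity in the second.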




In practice, type 1 and type 2 generating functions are the only ones ever
needed for solving mechanics problems that actually arise, but that obviously
won't satisfy compulsive pure mathematicians. Fortunately, for the canonical
transformation just given we have
$$\det\frac{\partial(Q^1, P_2)}{\partial(p_1, p_2)} \neq 0,$$
and we can define a generating function that is a mixture of types 1 and 2,
$$S(q^1, Q^1, q^2, P_2),$$
leading us to ask whether for every canonical transformation in $2n$ dimensions,
there is, at each point, always such a mixture. In other words, we want to know
whether, for each point, it is always possible to write $\{1, \dots, n\}$ as the disjoint
union $\{1, \dots, n\} = \mathcal E_Q \cup \mathcal E_P$, in such a way that
$$0 \neq \det\underbrace{\frac{\partial(q, Q^{\mathcal E_Q}, P_{\mathcal E_P})}{\partial(q, p)}}_{2n \times 2n} = \det\underbrace{\frac{\partial(Q^{\mathcal E_Q}, P_{\mathcal E_P})}{\partial p}}_{n \times n}.$$

The answer is that we can indeed always choose such $\mathcal E_Q$, $\mathcal E_P$ [type 3 and
type 4 generating functions are in fact virtually never used], and the proof
has nothing to do with physics, or even with $T^*M$. It follows from purely
abstract considerations about skew-symmetric functions on vector spaces. So it
is fitting that we leave this question to be answered in the next chapter, where
we will finally allow ourselves to indulge in one of the pure mathematician's
guilty pleasures.
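
For the specific dimension 2 example above, a mixed generating function can be written down explicitly. The sketch below assumes that the mixed type uses the type 1 rules for the first pair of coordinates and the type 2 rules for the second pair; that convention is an assumption here, since the text does not spell out the mixed rules.

```latex
\begin{align*}
% assumed mixed rules: type 1 in (q^1, Q^1), type 2 in (q^2, P_2)
S(q^1, Q^1, q^2, P_2) &= q^1Q^1 + q^2P_2, \\
p_1 &= \frac{\partial S}{\partial q^1} = Q^1, & P_1 &= -\frac{\partial S}{\partial Q^1} = -q^1, \\
p_2 &= \frac{\partial S}{\partial q^2} = P_2, & Q^2 &= \frac{\partial S}{\partial P_2} = q^2 .
\end{align*}
```

These four equations reproduce exactly $Q^1 = p_1$, $P_1 = -q^1$, $Q^2 = q^2$, $P_2 = p_2$, so under the assumed rules this $S$ does generate the transformation that defeated all four pure types.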




ADDENDUM 19A 


TIME-(IN)DEPENDENT 
HAMILTONIANS 


Given a time-independent Hamiltonian, for which we have the energy inte- 
gral (E) on page 529, we can obtain an equivalent system on a space of two 
lower dimensions, although the new Hamiltonian will depend on time. ‘This 
is a Classical construction that can be found in Whittaker [1]; §141], originally 
published in 1904, back in the olden days when the differential of a 1-form was 
still known as the “bilinear covariant”. As we will see later, it is also possible 
to reverse this process and formally reduce a problem for a time-dependent 
Hamiltonian to one with a Hamiltonian that does not depend on time. 

The sketch of the classical construction that we will be giving here follows 
Arnold [2; §45B], and amounts to a more geometric version of Whittaker’s 
proof. An alternate proof may be found in Abraham and Marsden [1; pg. 391].

Our considerations are local, and for simplicity we simply assume that we are
in $M = \mathbb{R}^n \times \mathbb{R}^n$ with coordinates $(q, p)$. We also consider
$M \times \mathbb{R} = \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}$,
with coordinates $(q, p, t)$, and the projection

\[ \pi_M \colon M \times \mathbb{R} \to M, \qquad \pi_M(p, t) = p. \]

We first note that for any 2-form $\omega$ on a manifold $N$ of odd dimension $2n + 1$,
at each point $p$ there is a non-zero tangent vector $v$ such that

\[ \omega(v, w) = 0 \quad \text{for all } w \text{ at } p. \]


Proof: the matrix $A$ of $\omega$ is skew-symmetric, $A^t = -A$, so

\[ \det A = \det A^t = \det(-A) = (-1)^{2n+1} \det A = -\det A, \]

so $\det A = 0$, and $A$ has an eigenvector $v$ with eigenvalue 0.
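The determinant argument can be checked on a concrete example. The following is only a minimal sketch: the helper `det` and the particular $3 \times 3$ matrix are illustrative choices, not anything taken from the text. It verifies exactly (with rational arithmetic) that a skew-symmetric matrix of odd size has determinant 0, and exhibits a vector in its kernel.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the matrix is singular
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


# An arbitrary skew-symmetric 3 x 3 matrix (odd dimension 2n + 1 with n = 1).
a, b, c = 2, -3, 5
A = [[0, a, b],
     [-a, 0, c],
     [-b, -c, 0]]

# For this 3 x 3 shape, (c, -b, a) spans the kernel, i.e. the characteristic direction.
v = [c, -b, a]
Av = [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
```

Here `det(A)` is exactly zero and `Av` is the zero vector, matching the statement that $A$ has an eigenvector with eigenvalue 0.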

If $\omega$ is nonsingular (that is, of maximal rank $2n$), then these vectors $v$ lie
in a 1-dimensional subspace, the characteristic subspace. In this case, a curve
$c \colon \mathbb{R} \to N$ with non-zero tangent vectors $c'(t)$ is called a
characteristic curve if $c'(t)$ is always in the characteristic subspace, noting
that such curves are determined only up to reparameterization.

In particular, consider the 1-form

\[ \theta - H\,dt = \sum_{i=1}^{n} p_i\,dq^i - H\,dt \]

on $M \times \mathbb{R}$, and the differential of this 1-form, denoted by $\Lambda$ in
the proof at the bottom of page 574. This proof showed that the characteristic
subspace for $\Lambda$ at a point is precisely the set of vectors at that point that
are tangent to the flow of the vector field $X$ corresponding to the Hamiltonian $H$.
And this means that if a characteristic curve $\gamma$ is reparameterized by $t$, so
that it is of the form $(c(t), t)$ for a curve $c$ in $M$, then $c$ is a solution of
Hamilton's equations for $H$, and conversely.

Now suppose that $(q, p)$ is not a critical point of $H$, so that $dH(q, p)$ is
non-singular. Renaming the coordinates, if necessary, we can assume that

\[ \frac{\partial H}{\partial p_n}(q, p) \neq 0. \]





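If $H(q, p) = h$, then by the implicit function theorem, locally we can find a
function $f$ with

\[ H\bigl(q^1, \dots, q^n, p_1, \dots, p_{n-1},
      f(q^1, \dots, q^{n-1}, -q^n, p_1, \dots, p_{n-1})\bigr) = h \]

[the $-q^n$ is intentional]. Letting $(\hat q, \hat p)$ stand for
$(q^1, \dots, q^{n-1}, p_1, \dots, p_{n-1})$, and renaming $q^n$ as $-\tau$, we can
say there is a function $K(\hat q, \hat p, \tau)$ with

\[ H\bigl(\hat q, -\tau, \hat p, K(\hat q, \hat p, \tau)\bigr) = h. \]

We then have

\[ \sum_{i=1}^{n} p_i\,dq^i - H\,dt
   = \sum_{i=1}^{n-1} p_i\,dq^i - K\,d\tau - H\,dt
   = \sum_{i=1}^{n-1} p_i\,dq^i - K\,d\tau - d(Ht) + t\,dH, \]

and taking $d$ of this equation we get

\[ \Lambda_H = \pi^*(\Lambda_K) + d(t\,dH)
   = \pi^*(\Lambda_K) \quad \text{on the subspace where } H = h, \]

where $\pi^*$ pulls $\Lambda_K$ back from the $(\hat q, \hat p, \tau)$ coordinates.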


In the figure below, (a) shows the set $E_h = \{(q, p) \in M : H(q, p) = h\}$,
together with its extension $E_h \times \mathbb{R} \subset M \times \mathbb{R}$.
(Unfortunately, since we can at best draw pictures of 3-dimensional objects, the
figure is rather misleading, because $E_h$ is usually a submanifold of
dimension > 1.)

Now suppose (b) that we have a curve $\gamma(t) = (c(t), t)$ where $c$ is a solution
of Hamilton's equations for the Hamiltonian $H$, and we project $\gamma$ back down
to the curve $\bar\gamma = \pi_M \circ \gamma$ in $E_h$. The original curve $\gamma$ is
a characteristic curve for $\Lambda_H$, so $\bar\gamma$ is a characteristic curve for
$\Lambda_K$, and thus, when reparameterized by $\tau$, it satisfies Hamilton's
equations for the ($\tau$-dependent) Hamiltonian $K$,

\[ \frac{dq^i}{d\tau} = \frac{\partial K}{\partial p_i}, \qquad
   \frac{dp_i}{d\tau} = -\frac{\partial K}{\partial q^i}, \qquad i = 1, \dots, n-1. \]

As one interesting consequence of this reduction, one can now use the 
extended Hamilton’s principle (page 534) to give a rather more direct proof 
of Maupertuis’ form of the Principle of Least Action (page 464), which the 
reader may work out, or find written out in Arnold [2; §45D]. Note that the 
additional assumption that (q, p) is not a critical point of H with which we 
have been working is precisely the additional assumption that we found neces- 
sary to add at the very end of our argument for the Principle of Least Action, 
on page 465. 


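Starting with a Hamiltonian $H$ that depends on time, we can formally reverse
this whole construction by letting $t$ be a new space variable. We consider

\[ \tilde M = M \times \mathbb{R} \times \mathbb{R} \quad \text{with coordinates }
   (q^1, \dots, q^n, t, p_1, \dots, p_n, p_t), \]

equipped with the symplectic 2-form

\[ \tilde\omega = \omega + dp_t \wedge dt \quad \text{and the Hamiltonian} \quad
   \tilde H = H + p_t. \]

For a curve $\tau \mapsto c(\tau) \in \tilde M$, Hamilton's equations for $\tilde H$ are

\[ \dot q^i = \frac{\partial H}{\partial p_i}, \qquad
   \dot p_i = -\frac{\partial H}{\partial q^i}, \qquad
   \dot t = 1, \qquad
   \dot p_t = -\frac{\partial H}{\partial t}. \]

The first three of these equations show that the solutions for this ($\tau$-independent)
system project down on $M \times \mathbb{R}$ to the solutions for the time-dependent
system $(M, \omega, H)$.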

Thus, in theory, it is possible to consider only Hamiltonians that do not depend
on time. However, such an approach does not necessarily make things
any easier to understand!
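The extension can be tried out numerically. The sketch below rests on assumptions that are not in the text: the time-dependent Hamiltonian $H(q, p, t) = p^2/2 + qt$ with one degree of freedom is chosen purely for illustration, and the integrator is a standard fourth-order Runge-Kutta step. It integrates Hamilton's equations for $\tilde H = H + p_t$ and lets one check that $t$ simply tracks the curve parameter, that $(q, p)$ follow the time-dependent system, and that $\tilde H$ is conserved.

```python
def rhs(state):
    """Hamilton's equations for H~ = H + p_t with the illustrative H = p^2/2 + q*t."""
    q, t, p, p_t = state
    # dq/dtau = dH/dp, dt/dtau = 1, dp/dtau = -dH/dq, dp_t/dtau = -dH/dt
    return [p, 1.0, -t, -q]


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return [b + factor * s for b, s in zip(base, slope)]

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, h / 2))
    k3 = rhs(shifted(state, k2, h / 2))
    k4 = rhs(shifted(state, k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def extended_hamiltonian(state):
    """H~(q, t, p, p_t) = p^2/2 + q*t + p_t."""
    q, t, p, p_t = state
    return 0.5 * p * p + q * t + p_t


def integrate(state, tau_end, steps):
    h = tau_end / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state


initial = [1.0, 0.0, 0.5, 0.0]          # (q, t, p, p_t) at tau = 0
final = integrate(initial, 2.0, 200)    # state at tau = 2
```

For this particular $H$ the time-dependent system can be solved by hand, $q(t) = q_0 + p_0 t - t^3/6$, which gives something to compare `final[0]` against; `extended_hamiltonian` should agree at `initial` and `final` up to integration error.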


It should be mentioned that there are apparently even more general versions 
of this procedure that sometimes prove to be useful. See Cordani [1; pg. 398]. 




ADDENDUM 19B 


GENERALIZED 
CANONICAL TRANSFORMATIONS 


The fact that generalized canonical transformations also preserve the Hamiltonian
structure of equations, and that these are the only such transformations,
was first pointed out and proved in Lee [1], where it is derived from a theorem
about integral invariants. In our discussion of (relative) integral invariants, it
should perhaps have been pointed out that any particular Hamiltonian $H$ will
generally have many other such invariants. The point about $\theta$ and $\omega$ is that
they are "universal" integral invariants for all $H$, as, of course, are the
higher-dimensional $\theta \wedge \omega$, $\omega \wedge \omega$, etc., and Lee proved
that the only universal invariants are constant multiples of these, from which the
result about generalized canonical transformations is easily deduced (Problem 5).
Here we give a direct proof of the result for generalized canonical transformations.


LEMMA. $C^t J C = J \iff C J C^t = J$.


PROOF. $C^t J C = J$ implies $C^{-1} J^{-1} (C^t)^{-1} = J^{-1}$, and since
$J^{-1} = -J$, this gives

\[ C^{-1} J (C^t)^{-1} = J. \]

Multiplying by $C$ on the left and $C^t$ on the right then gives $J = C J C^t$. ∎
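The Lemma is easy to test on an example. The sketch below makes two assumptions that are not fixed by this passage: it takes $J$ to be the block matrix with $I$ in the upper right and $-I$ in the lower left (the Lemma only uses $J^{-1} = -J$, so the opposite sign convention works equally well), and it builds a symplectic $4 \times 4$ matrix as a product of two elementary block matrices chosen for illustration.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(row) for row in zip(*X)]


# Assumed convention: J = [[0, I], [-I, 0]] with 2 x 2 blocks.
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]

# Block-diagonal factor diag(A, (A^t)^{-1}) with A = [[1, 1], [0, 1]].
C1 = [[1, 1, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, -1, 1]]

# Shear factor [[I, B], [0, I]] with B = [[1, 2], [2, 3]] symmetric.
C2 = [[1, 0, 1, 2],
      [0, 1, 2, 3],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]

C = matmul(C1, C2)

left_form = matmul(matmul(transpose(C), J), C)    # C^t J C
right_form = matmul(matmul(C, J), transpose(C))   # C J C^t

# A matrix that is not symplectic fails both conditions together.
D = [[2, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
D_left = matmul(matmul(transpose(D), J), D)
D_right = matmul(matmul(D, J), transpose(D))
```

Both `left_form` and `right_form` come out equal to `J`, while `D_left` and `D_right` both differ from `J`, as the equivalence predicts.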


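5. THEOREM. If $f \colon T^*M \to T^*M$ takes all Hamiltonian vector fields into
Hamiltonian vector fields, then $f$ is a generalized canonical transformation.


PROOF. Given a Hamiltonian $K$ on $T^*M$, let $H = K \circ f$, so that

\[ (DH)^t = (Df)^t (DK)^t \qquad \text{(first line of equation (d) on page 568)} \]

and for a curve $c$ in $T^*M$ let $\gamma = f \circ c$, so that

\[ c' = (Df)^{-1} \gamma' \qquad \text{(equation (c) on page 568).} \]

Then

\[ c' - J(DH)^t = (Df)^{-1}\gamma' - J(Df)^t (DK)^t, \]

so we have

\[ \begin{aligned}
(Df)\bigl(c' - J(DH)^t\bigr) &= \gamma' - (Df) J (Df)^t (DK)^t \\
  &= \gamma' - J\bigl(-J (Df) J (Df)^t\bigr)(DK)^t \\
  &= \gamma' - J P (DK)^t, \text{ say.}
\end{aligned} \]

Since by hypothesis $f$ takes curves satisfying $c' - J(DH)^t = 0$ into curves $\gamma$
satisfying Hamilton's equations for some Hamiltonian, it follows that for all $K$,
the product $P(DK)^t$ must be $(D\bar K)^t$ for some function $\bar K$, so that

\[ \frac{\partial \bar K}{\partial x_i} = \sum_{k=1}^{2n} P_{ik} \frac{\partial K}{\partial x_k}. \]

Using equality of mixed partials of $\bar K$ we find that

\[ \sum_{k=1}^{2n} \frac{\partial P_{ik}}{\partial x_j} \frac{\partial K}{\partial x_k}
   + \sum_{k=1}^{2n} P_{ik} \frac{\partial^2 K}{\partial x_j \partial x_k}
   = \sum_{k=1}^{2n} \frac{\partial P_{jk}}{\partial x_i} \frac{\partial K}{\partial x_k}
   + \sum_{k=1}^{2n} P_{jk} \frac{\partial^2 K}{\partial x_i \partial x_k}. \]

Choosing $K$ for which the second partials are 0, we see that we have separately

\[ (1) \qquad \sum_{k=1}^{2n} \frac{\partial P_{ik}}{\partial x_j} \frac{\partial K}{\partial x_k}
   = \sum_{k=1}^{2n} \frac{\partial P_{jk}}{\partial x_i} \frac{\partial K}{\partial x_k} \]

and thus also separately

\[ (2) \qquad \sum_{k=1}^{2n} P_{ik} \frac{\partial^2 K}{\partial x_j \partial x_k}
   = \sum_{k=1}^{2n} P_{jk} \frac{\partial^2 K}{\partial x_i \partial x_k}. \]

From (2) we can see that $P_{ij} \neq 0$ only for $i = j$, and moreover that all
$P_{ii} = a$ for the same function $a$. Then from (1) we can see that $a$ is a
constant. So $P = a I_{2n}$, or

\[ -J(Df)J(Df)^t = a I_{2n} \implies (Df)J(Df)^t = J a I_{2n} = aJ, \]

which, using the Lemma, is the condition for $f$ to be a generalized canonical
transformation. ∎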


Remarks. This proof comes from Giaquinta and Hildebrandt [1; Chap. 9, §3.1];
a proof may also be found in Pars [1; §25.4], while the more general version
for time-dependent Hamiltonians is considered in Siegel and Moser [1]; several
more recent texts also have proofs. All the proofs, including Lee's, seem to
involve similar manipulations.




PROBLEMS 
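1. Recall that if $\{\phi_t\}$ is the flow of $X$, we define $L_X\lambda$ for a
$k$-form $\lambda$ by

\[ L_X\lambda = \lim_{h \to 0} \frac{1}{h}\bigl[\phi_h^*\lambda - \lambda\bigr]. \]

It is easy to see that $L_X\lambda$ is also a $k$-form. If $\lambda$ is a 0-form, i.e.,
a function $f$, then $L_X f = Xf = df(X)$.
(a) Show that

\[ \begin{aligned}
L_X(\lambda_1 + \lambda_2) &= L_X\lambda_1 + L_X\lambda_2 \\
L_X(f\lambda) &= Xf \cdot \lambda + f \cdot L_X\lambda \\
L_X(\lambda \wedge \mu) &= L_X\lambda \wedge \mu + \lambda \wedge L_X\mu.
\end{aligned} \]

(b) For

\[ L_X(dx^i) = \lim_{h \to 0} \frac{1}{h}\bigl[(\phi_h^*)(dx^i) - dx^i\bigr]
   = \lim_{h \to 0} \frac{1}{h}\Bigl[\sum_{j=1}^{n}
     \frac{\partial(x^i \circ \phi_h)}{\partial x^j}\,dx^j - dx^i\Bigr] \]

the coefficient of $dx^j$ is

\[ \lim_{h \to 0} \frac{1}{h}\Bigl[\frac{\partial(x^i \circ \phi_h)}{\partial x^j}
   - \frac{\partial(x^i \circ \phi_0)}{\partial x^j}\Bigr]. \]

Show that this equals

\[ \frac{\partial}{\partial x^j} \lim_{h \to 0} \frac{1}{h}
   \bigl[(x^i \circ \phi_h) - (x^i \circ \phi_0)\bigr]. \]

Hint: Consider the map $A(h, q) = x^i(\phi_h(q))$ from $\mathbb{R} \times M$ to $\mathbb{R}$.

(c) If $X = \sum_{i=1}^{n} a^i\,\partial/\partial x^i$, then

\[ L_X\,dx^i = \sum_{j=1}^{n} \frac{\partial a^i}{\partial x^j}\,dx^j. \]

(d) Conclude that $L_X(dx^i) = d(L_X x^i)$ and then that in general
$L_X\,d\lambda = d(L_X\lambda)$.
(e) For a $k$-form $\lambda$, the definition

\[ X \lrcorner \lambda(X_2, \dots, X_k) = \lambda(X, X_2, \dots, X_k) \]

is naturally extended to mean that for a 1-form $\omega$ we have
$X \lrcorner \omega = \omega(X)$, and we also set $X \lrcorner f = 0$ for a function
(0-form) $f$. Check that Cartan's Magic Formula holds for the 0-form $f$, and for the
1-form $dx^i$, and conclude that it holds for all $k$-forms.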
(f) Prove Cartan’s Bracket Formula similarly. 


2. (a) Determine the canonical transformation produced by the type 1 gener- 


ating function 
n 
SS 7. 
i=l 
(b) Same question for the type 2 generating function

\[ S = \sum_{i=1}^{n} q^i P_i. \]

(c) For the type 2 generating function

\[ S = \sum_{i=1}^{n} f_i(q, t)\,P_i, \]

show that $Q^i = f_i(q, t)$, so that we can obtain an arbitrary function
$f \colon M \to M$, and that the canonical transformation is simply
$g = f^* \colon T^*M \to T^*M$. These are sometimes referred to as "point
transformations".
(d) Check that not only does $g$ preserve $\omega$, but in fact $g$ preserves the 1-form

\[ \theta = \sum_{i=1}^{n} p_i\,dq^i. \]

Such transformations are sometimes called "homogeneous canonical transformations".


Proving this directly can be confusing, especially if one wants a coordinate-free
proof. It also turns out that all homogeneous canonical transformations are point
transformations. For these matters, see Abraham and Marsden [1; Th. 3.2.12
and Ex. 3.2F] or Marsden and Ratiu [1; §6.3].


3. (a) Show that the type 1 generating function

\[ S(q, Q) = \tfrac{1}{2}\,\omega q^2 \cot Q \]

gives the canonical transformation on page 571.
(b) Show that this transformation changes the Hamiltonian

\[ H(q, p) = \tfrac{1}{2}\,(p^2 + \omega^2 q^2) \]

for the harmonic oscillator to

\[ H = \omega P \implies P = \frac{E}{\omega}, \]

so that

\[ \dot Q = \frac{\partial H}{\partial P} = \omega, \qquad
   q = \frac{\sqrt{2E}}{\omega}\,\sin(\omega t + \delta), \]

equivalent to the solution on page 551 found in Chapter 18.



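4. Because of equation (F$_2$-b) on page 578, the type 2 generating function $S_2$
may be thought of as the negative of the Legendre transform of the type 1
generating function $S_1$:

\[ \begin{aligned}
\mathcal{L}(S_1)(q, P) &= \sum_{i=1}^{n} \frac{\partial S_1}{\partial Q^i}(q, Q)\,Q^i - S_1(q, Q) \\
  &= -\Bigl(\sum_{i=1}^{n} P_i Q^i(q, P) + S_1(q, Q)\Bigr)
   = -\Bigl(S(q, Q) + \sum_{i=1}^{n} P_i Q^i\Bigr).
\end{aligned} \]

(a) Similarly, when we have the condition (J3), use the Legendre transform

\[ \mathcal{L}(S_1)(p, Q) = \sum_{i=1}^{n} \frac{\partial S_1}{\partial q^i}\,q^i - S_1(q, Q)
   = \sum_{i=1}^{n} p_i q^i - S_1(q, Q) \]

to define a type 3 generating function

\[ S_3\bigl(p, Q(q, p)\bigr) = S(q, p) - \sum_{i=1}^{n} p_i q^i \]

and show that

\[ \begin{aligned}
\frac{\partial S_3}{\partial p_i}(p, Q, t) &= -q^i \\
\frac{\partial S_3}{\partial Q^i}(p, Q, t) &= -P_i \\
K \circ g &= H + \frac{\partial S_3}{\partial t}.
\end{aligned} \]

(b) When we have the condition (J4), use a double Legendre transformation to
define

\[ S_4\bigl(p, P(q, p)\bigr) = S(q, p) + \sum_{i=1}^{n} P_i Q^i - \sum_{i=1}^{n} p_i q^i, \]

and show that

\[ \begin{aligned}
\frac{\partial S_4}{\partial p_i}(p, P, t) &= -q^i \\
\frac{\partial S_4}{\partial P_i}(p, P, t) &= Q^i \\
K \circ g &= H + \frac{\partial S_4}{\partial t}.
\end{aligned} \]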



5. A universal 2-dimensional integral invariant is a 2-form $\tilde\omega$ on $T^*M$
with the property that for any 2-chain $D$ we have

\[ \int_D \tilde\omega = \int_{\phi_t(D)} \tilde\omega \]

for the flow $\{\phi_t\}$ of any vector field $X_H$ given by
$X_H \lrcorner \omega = -dH$ for some $H$. Lee's theorem says that any such
$\tilde\omega$ is just $c\,\omega$ for some constant $c$.

Conclude that if $f \colon T^*M \to T^*M$ has the property that $f_*$ takes any
vector field of the form $X_H$ into one of the form $X_K$, then $f^*\omega$ is
$c\,\omega$ for some constant $c$.


CHAPTER 20 


SYMPLECTIC 
MANIFOLDS 


Generalize! 

Let no one else's work evade your eyes,
Remember why the good Lord made your eyes,
So don’t shade your eyes, 

But generalize, generalize, generalize! 


— with apologies to Tom Lehrer 


Like many another specialized type of manifold, symplectic manifolds are
based on a particular structure on vector spaces. 


Symplectic vector spaces. A bilinear map $\omega \colon V \times V \to \mathbb{R}$ on a
real vector space $V$ is nondegenerate if it has the following property:

\[ \text{if } \omega(v, w) = 0 \text{ for all } w, \text{ then } v = 0. \]

When $\omega$ is symmetric, with $\omega(v, w)$ often denoted by $\langle v, w \rangle$,
we call $V$ together with $\langle\ ,\ \rangle$ an inner product space. When $\omega$
is skew-symmetric, on the other hand, with $\omega(v, w)$ often denoted by $[v, w]$,
we call $V$ together with $[\ ,\ ]$ a symplectic vector space,¹ and $[\ ,\ ]$ is
often called the skew-inner product.

For a basis $e_1, \dots, e_d$ of $V$, the matrix $A = (a_{ij})$ of $\omega$ is defined by

\[ a_{ij} = \omega(e_i, e_j), \qquad i, j = 1, \dots, d. \]

Equivalently, if we define $\alpha \colon V \to V^*$ by

\[ \alpha(v)(w) = \omega(v, w), \]

then the matrix of $\alpha$ with respect to the basis $e_i$ of $V$ and the dual basis
$e_i^*$ of $V^*$ is $A^t$ (with our convention for writing matrices with the columns
representing the image vectors). The rank of $\omega$ is the rank of $\alpha$, that is,
the dimension of $\alpha(V)$, so $\omega$ being nondegenerate is equivalent to saying
that the rank of $\omega$ is the dimension of $V$.


¹ The origins of the strange word "symplectic" are discussed in Weinstein [2].




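When $\omega$ is symmetric or skew-symmetric, we can find a basis for which $A$ is
especially simple, even without the nondegeneracy condition. For symmetric $\omega$,
we first note that either version of the "polarization identity"

\[ \langle v, w \rangle
   = \tfrac{1}{2}\bigl[\langle v + w, v + w \rangle - \langle v, v \rangle - \langle w, w \rangle\bigr]
   = \tfrac{1}{4}\bigl[\langle v + w, v + w \rangle - \langle v - w, v - w \rangle\bigr] \]

shows that if $\langle\ ,\ \rangle$ is not identically 0, then $\langle v, v \rangle$ is
not always 0, and hence there is some $e_1 \in V$ such that
$\langle e_1, e_1 \rangle = a_1 = \pm 1$. For
$e_1^\perp = \{v \in V : \langle v, e_1 \rangle = 0\}$ we have

\[ (\mathbb{R} \cdot e_1) \oplus e_1^\perp = V, \]

since the intersection of the two subspaces is clearly $\{0\}$, while for any $v \in V$
we have

\[ v - a_1 \langle v, e_1 \rangle e_1 \in e_1^\perp. \]

If $\langle\ ,\ \rangle$ is non-zero on $e_1^\perp$, then we can choose
$e_2 \in e_1^\perp$ with $\langle e_2, e_2 \rangle = a_2 = \pm 1$. Continuing in this
way, we find a basis $e_1, \dots, e_d$ for which the matrix of $\langle\ ,\ \rangle$ is

\[ \operatorname{diag}(\pm 1, \dots, \pm 1, 0, \dots, 0). \]

When $\omega$ is nondegenerate, this is just the Gram-Schmidt orthogonalization
process. Note that in this case, for any given non-zero vector $e$, some multiple
of $e$ can be chosen as $e_1$.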

In the case of a skew-inner product $\omega = [\ ,\ ]$, the situation is a little more
involved. The rank of $\omega$ turns out to be $2n$ for some integer $n$, and we get a
basis for which the matrix involves our old friend $J = J_n$,

\[ \begin{pmatrix} J_n & 0 \\ 0 & 0 \end{pmatrix}
   \qquad \text{(diagonal blocks of sizes } 2n \text{ and } d - 2n\text{)}, \]

but the proof isn't much more complicated. If $\omega \neq 0$, then there are two
vectors, which we will call $e_1$ and $e_{n+1}$, with $\omega(e_1, e_{n+1}) \neq 0$, and
by dividing $e_1$ by a constant we can assume that $\omega(e_1, e_{n+1}) = 1$. In the
2-dimensional subspace $W$ spanned by $e_1, e_{n+1}$, the matrix of $\omega$ is

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

Now we have $V = W \oplus W^\perp$, and we continue, exactly as in the symmetric case.
Again, for the nondegenerate case, a multiple of any non-zero vector can be
chosen as $e_1$.

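It is straightforward to check that for the dual basis $e_1^*, \dots, e_d^*$, we can
write $\omega$ as

\[ \text{(a)} \qquad \omega = \sum_{i=1}^{n} e_i^* \wedge e_{n+i}^*, \qquad 2n \le d. \]

Clearly, a skew-symmetric $\omega$ can be nondegenerate only if $V$ has even dimension
$d = 2n$, and in this case we find, as on page 512, that in $\Lambda^{2n}(V)$, the vector
space of alternating functions from the $2n$-fold product $V \times \cdots \times V$ to
$\mathbb{R}$, we have

\[ \text{(b)} \qquad \underbrace{\omega \wedge \cdots \wedge \omega}_{n \text{ times}}
   = (-1)^N\, n!\; e_1^* \wedge \cdots \wedge e_{2n}^* \]

for some $N$. This is a non-zero element of $\Lambda^{2n}(V)$, and thus determines an
orientation of $V$.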
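As a quick check of formula (b), here is the smallest non-trivial case worked out by hand; the choice $n = 2$ is only for illustration.

```latex
% n = 2, so omega = e_1^* \wedge e_3^* + e_2^* \wedge e_4^* by (a).
\begin{aligned}
\omega \wedge \omega
  &= (e_1^* \wedge e_3^* + e_2^* \wedge e_4^*) \wedge (e_1^* \wedge e_3^* + e_2^* \wedge e_4^*) \\
  &= e_1^* \wedge e_3^* \wedge e_2^* \wedge e_4^* + e_2^* \wedge e_4^* \wedge e_1^* \wedge e_3^*
     && \text{(the squared terms vanish)} \\
  &= 2\, e_1^* \wedge e_3^* \wedge e_2^* \wedge e_4^*
     && \text{(2-forms commute)} \\
  &= -2\, e_1^* \wedge e_2^* \wedge e_3^* \wedge e_4^*
     && \text{(one transposition)}.
\end{aligned}
```

So in this case $n! = 2$ and $N = 1$, in agreement with (b).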

If $V$ with $[\ ,\ ]$ is a symplectic vector space of dimension $2n$, a linear
transformation $T \colon V \to V$ is naturally called symplectic if

\[ [Tv, Tw] = [v, w] \]

for all $v, w \in V$; any basis $e_1, \dots, e_{2n}$ for which equation (a) holds is
called a symplectic basis; and $T$ is symplectic if and only if it takes any
symplectic basis into a symplectic basis. As on page 569, we conclude that
$\det T = 1$. There are obvious extensions to linear transformations from one
symplectic vector space to another.


Isotropic subspaces. Given two vectors $v$ and $w$ in a symplectic vector space $V$,
we will call them skew-orthogonal, $v \perp w$, if $[v, w] = 0$. A subspace $W$ of a
symplectic vector space $V$ is called isotropic if $W \perp W$, that is, if
$v \perp w$ for all $v, w \in W$, or $[v, w] = 0$ for all $v, w \in W$. In particular,
suppose we choose a symplectic basis, which we will suggestively call
$(q^1, \dots, q^n, p_1, \dots, p_n)$, and we choose a set of indices $\mathcal{A}_Q$ and
a set $\mathcal{A}_P$ so that $\{1, \dots, n\}$ is the disjoint union
$\{1, \dots, n\} = \mathcal{A}_Q \cup \mathcal{A}_P$. Then (a) shows that $I$ is
isotropic, where

\[ I = \text{subspace spanned by all } q^i \text{ for } i \in \mathcal{A}_Q
       \text{ and all } p_j \text{ for } j \in \mathcal{A}_P; \]

there are $2^n$ of these isotropic "coordinate subspaces".

In terms of $(q^1, \dots, q^n, p_1, \dots, p_n)$, we can define an inner product on $V$
by declaring this basis to be orthonormal (this is not invariant, but will still be
temporarily useful). Since $[\ ,\ ] \colon V \times V \to \mathbb{R}$ is a bilinear
function on this inner product space, there is, as for any bilinear function, a
unique linear transformation $T \colon V \to V$ such that

\[ \text{(c)} \qquad [v, w] = \langle T(v), w \rangle \]

for all $v, w \in V$ (the matrix of $T$ with respect to the chosen basis is $J$).


It follows from (c) that $W$ is isotropic if and only if $T(W)$ is orthogonal to $W$
with respect to the inner product $\langle\ ,\ \rangle$ just defined. Hence:


LEMMA. The dimension of an isotropic subspace of $V$ is $\le n$.


PROOF. The isotropic subspace $W$ and the subspace $T(W)$ have the same
dimension, and they can't be orthogonal if they both have dimension $> n$. ∎


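THEOREM. For every $n$-dimensional isotropic subspace $W$ of $V$, at least one
of the $2^n$ coordinate isotropic subspaces $I$ is transverse to $W$, that is, the
intersection $W \cap I = \{0\}$.

PROOF. Let $\mathcal{Q}$ be the isotropic subspace spanned by $q^1, \dots, q^n$, and let

\[ W_1 = W \cap \mathcal{Q}, \qquad \dim W_1 = k \le n. \]

Any $k$-dimensional subspace of $\mathcal{Q}$, in particular the subspace $W_1$, is
transverse to some $(n - k)$-dimensional coordinate subspace of $\mathcal{Q}$, say

\[ \mathcal{Q}_1 = \text{subspace spanned by } q^{i_1}, \dots, q^{i_{n-k}}, \qquad
   W_1 + \mathcal{Q}_1 = \mathcal{Q}, \qquad W_1 \cap \mathcal{Q}_1 = \{0\}. \]

Choosing $j_1, \dots, j_k$ so that $\{i_1, \dots, i_{n-k}, j_1, \dots, j_k\}$ are $n$
distinct numbers in $\{1, \dots, n\}$, let

\[ I = \text{subspace spanned by } q^{i_1}, \dots, q^{i_{n-k}}, p_{j_1}, \dots, p_{j_k},
   \qquad \mathcal{Q}_1 = I \cap \mathcal{Q}. \]

We claim that

\[ W \cap I = \{0\}. \]

To prove this, we note that

\[ \begin{aligned}
W_1 \subset W \text{ and } W \perp W &\implies W_1 \perp W \\
\mathcal{Q}_1 \subset I \text{ and } I \perp I &\implies \mathcal{Q}_1 \perp I,
\end{aligned} \]

which implies that

\[ W_1 + \mathcal{Q}_1 \perp W \cap I \implies \mathcal{Q} \perp W \cap I. \]

Since $\mathcal{Q}$ has dimension $n$, the Lemma implies that any vector
skew-orthogonal to $\mathcal{Q}$ must be in $\mathcal{Q}$, so we have

\[ W \cap I \subset \mathcal{Q} \implies
   W \cap I = (W \cap \mathcal{Q}) \cap (I \cap \mathcal{Q}) = W_1 \cap \mathcal{Q}_1 = \{0\}. \ ∎ \]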
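The theorem can be explored by brute force in a small case. The sketch below is illustrative only: the helper names, the ordering of the basis as $(q^1, q^2, p_1, p_2)$, and the particular isotropic plane $W$ are assumptions made for the example. It enumerates all $2^n$ coordinate isotropic subspaces for $n = 2$ and keeps those transverse to $W$, testing transversality by checking that a basis of $W$ together with a basis of $I$ has non-zero determinant.

```python
from fractions import Fraction
from itertools import product


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


def skew_product(v, w, n):
    """[v, w] for omega = sum_i e_i^* ^ e_{n+i}^*, as in equation (a)."""
    return sum(v[i] * w[n + i] - v[n + i] * w[i] for i in range(n))


def basis_vector(k, dim):
    return [1 if j == k else 0 for j in range(dim)]


def transverse_choices(W, n):
    """Labels such as 'pq' (use p_1 and q^2) of coordinate subspaces transverse to W."""
    found = []
    for choice in product("qp", repeat=n):
        I = [basis_vector(i if c == "q" else n + i, 2 * n)
             for i, c in enumerate(choice)]
        if det(W + I) != 0:   # W and I together span V, so they meet only in 0
            found.append("".join(choice))
    return found


n = 2
# W is spanned by q^1 and q^2 + p_2; it is isotropic of dimension n.
W = [[1, 0, 0, 0],
     [0, 1, 0, 1]]
choices = transverse_choices(W, n)
```

For this $W$ the proof's recipe gives $W_1 = W \cap \mathcal{Q}$ spanned by $q^1$, then $\mathcal{Q}_1$ spanned by $q^2$, and finally $I$ spanned by $q^2$ and $p_1$, which appears in `choices` under the label `"pq"`.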




On the manifold $T^*M$ we have the standard 2-form $\omega$, which gives a symplectic
structure on each tangent space, and a canonical map is just the same
as one that is symplectic on each tangent space (going into the tangent space at 
a different point). 


1. COROLLARY.¹ If $g \colon T^*M \to T^*M$ is canonical, then in a neighborhood
of any point $(q, p)$ we can write $\{1, \dots, n\} = \mathcal{A}_Q \cup \mathcal{A}_P$
in such a way that, in the notation introduced on page 589,

\[ (*) \qquad \det \frac{\partial(q, Q^\alpha, P_\beta)}{\partial(q, p)}\bigg|_{(q, p)} \neq 0,
   \qquad \alpha \in \mathcal{A}_Q,\ \beta \in \mathcal{A}_P. \]

PROOF. Considering $T^*M$ as $\mathbb{R}^{2n}$, in the tangent space of
$\mathbb{R}^{2n}$ at $(q, p)$, let $\mathcal{Q}$ be the isotropic subspace spanned by
$q^1, \dots, q^n$, and let

\[ W = g_*(\mathcal{Q}) = g_{*(q, p)}(\mathcal{Q}). \]

Since $g$ is canonical, $g_*$ is symplectic, so $W$ is isotropic.

Let $I$ be the isotropic coordinate space transverse to $W$ given by the theorem,
determined by certain $q^\alpha$ and $p_\beta$, and let $K$ be the isotropic coordinate
space determined by all the other $q^\alpha$ and $p_\beta$, so that the tangent space
of $\mathbb{R}^{2n}$ at $(q, p)$ is the direct sum $K \oplus I$. Finally, let $\pi_K$
be the projection onto $K$ determined by this direct sum decomposition.

Since $I$ is transverse to $W$, it follows that $\pi_K$ is one-to-one on
$W = g_*(\mathcal{Q})$, and this is equivalent to $(*)$. ∎


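We can easily extend this to a canonical
$g \colon T^*M \times \mathbb{R} \to T^*M \times \mathbb{R}$: in a neighborhood of any
point $(\bar q, \bar p, t_0)$ we can write
$\{1, \dots, n\} = \mathcal{A}_Q \cup \mathcal{A}_P$ in such a way that

\[ \det \frac{\partial(q, Q^\alpha, P_\beta, t)}{\partial(q, p, t)}\bigg|_{(\bar q, \bar p, t_0)} \neq 0,
   \qquad \alpha \in \mathcal{A}_Q,\ \beta \in \mathcal{A}_P. \]

Applying this to the discussions on pages 586 and 589, we have


2. COROLLARY. For a Hamiltonian system $H$ there is a canonical transformation
$(q, p) \mapsto (Q, P)$ and $\mathcal{A}_Q$ and $\mathcal{A}_P$ as above, for which
$Q^\alpha = \text{constant}$ are solutions for $\alpha \in \mathcal{A}_Q$ and
$P_\beta = \text{constant}$ are solutions for $\beta \in \mathcal{A}_P$.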


¹ This proof, and the preceding theorem, comes from Arnold [2; §48B]. A less abstractly
presented proof, perhaps the original(?), appears in Carathéodory [1; §§96, 103].




Symplectic manifolds. A 2-form $\omega$ on $M$ which is nondegenerate, as defined
on page 512, is simply one for which each $\omega(p)$ is nondegenerate on the tangent
space $M_p$, thus giving us a symplectic structure on each tangent space.
One might think that $(M, \omega)$ would be called a "symplectic manifold", but the
definition actually adds one more condition:


DEFINITION. $(M, \omega)$ is a symplectic manifold if $\omega$ is nondegenerate and
$d\omega = 0$.


The prototypical example of a symplectic manifold is, of course, $(T^*M, \omega)$.
[Despite the possibility of confusion, we use $M$ both for a symplectic manifold,
and in situations like $T^*M$.] A classical theorem of Darboux shows that locally
every symplectic manifold looks like such a manifold. Happily, there is now a 
delightfully short proof of this theorem, due to Weinstein [1], after Moser [1] 
(also compare DG, Prob. 8-27), which has the advantage of working for infinite 
dimensional manifolds, although we won’t be considering such manifolds here. 


THEOREM (DARBOUX). Let $\omega$ be a nondegenerate 2-form on a manifold $M$ of
dimension $2n$. Then $d\omega = 0$ if and only if around each point $p \in M$ there is
a symplectic coordinate system, that is, a coordinate system
$(q^1, \dots, q^n, p_1, \dots, p_n)$ in which

\[ \omega = \sum_{i=1}^{n} dp_i \wedge dq^i. \]

[Thus $d\omega = 0$ for a nondegenerate skew-symmetric $\omega$ corresponds to $R = 0$
for a nondegenerate symmetric $\langle\ ,\ \rangle$ (a Riemannian metric), since
$R = 0$ is the condition that there is a coordinate system for which
$\langle\ ,\ \rangle = \sum_{i=1}^{n} dx^i \otimes dx^i$.]


PROOF. If $\omega$ has this form, obviously $d\omega = 0$. For the converse, it suffices
to find around each point a coordinate system in which $\omega$ has constant
coefficients, since the procedure on pages 600-601 can then be applied simultaneously
to all tangent spaces at the points on which the coordinate system is defined.

Since we are working locally, we assume that $M = \mathbb{R}^{2n}$, and let $p$ be the
origin 0. Let $\omega_1$ be the form with constant coefficients that equals $\omega(0)$
at 0, and define

\[ \omega_t = t\,\omega_1 + (1 - t)\,\omega = \omega + t(\omega_1 - \omega),
   \qquad 0 \le t \le 1. \]

Then each $\omega_t(0) = \omega(0)$ is nondegenerate, so there is a neighborhood of 0
on which $\omega_t$ is nondegenerate for all $0 \le t \le 1$. Shrinking this down to a
ball, we can write the closed form $\omega_1 - \omega = d\alpha$ for some 1-form
$\alpha$, and we can assume $\alpha(0) = 0$.




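For each $t$, let $X^t$ be the smooth vector field with $X^t(0) = 0$ and

\[ X^t \lrcorner \omega_t = -\alpha. \]

Since $X^t(0) = 0$, by restricting the ball further if necessary, we can assume that
the 1-parameter family of diffeomorphisms $f_t$ generated by $X^t$ are all defined
on $[0, 1]$. (This 1-parameter family is not usually a 1-parameter group.)

We then have

\[ \begin{aligned}
\frac{d}{dt}\bigl(f_t^*\omega_t\bigr)
  &= f_t^*\bigl(L_{X^t}\omega_t\bigr) + f_t^*\Bigl(\frac{d\omega_t}{dt}\Bigr)
     && \text{(cf. Problem 1)} \\
  &= f_t^*\bigl(d(X^t \lrcorner \omega_t) + 0\bigr) + f_t^*(\omega_1 - \omega)
     && \text{by the Cartan formula} \\
  &= f_t^*(-d\alpha + \omega_1 - \omega) = 0.
\end{aligned} \]

Therefore $f_1^*\omega_1 = f_0^*\omega_0 = \omega$, so $f_1$ is the coordinate system
that transforms $\omega$ to the constant form $\omega_1$. ∎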


Aside from making abstract symplectic manifolds analogous to the examples
$(T^*M, \omega)$ that spurred our interest in symplectic structures in the first place,
$d\omega = 0$ is also the condition that will make Poisson brackets work on abstract
symplectic manifolds.


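Poisson brackets. The definition of Poisson brackets was originally made for
$T^*M \times \mathbb{R}$ (actually, $\mathbb{R}^{2n+1}$), and that is where our
discussion will begin also. Given a Hamiltonian $H$ on $T^*M \times \mathbb{R}$, in
order for a function $f \colon T^*M \times \mathbb{R} \to \mathbb{R}$ to be an
"integral of the motion", that is, for $f$ to have a constant value on each
solution $c \colon \mathbb{R} \to T^*M$ of Hamilton's equations, we need for each
solution $c$ that

\[ \begin{aligned}
0 = \frac{d}{dt} f(c(t), t)
  &= \frac{\partial f}{\partial t}(c(t), t)
   + \sum_{i=1}^{n} \Bigl[\frac{\partial f}{\partial q^i}(c(t), t)\,(q^i \circ c)'(t)
   + \frac{\partial f}{\partial p_i}(c(t), t)\,(p_i \circ c)'(t)\Bigr] \\
  &= \frac{\partial f}{\partial t}(c(t), t)
   + \sum_{i=1}^{n} \Bigl[\frac{\partial f}{\partial q^i}\frac{\partial H}{\partial p_i}
   - \frac{\partial f}{\partial p_i}\frac{\partial H}{\partial q^i}\Bigr](c(t), t).
\end{aligned} \]

Since there are solutions through every point, we thus need

\[ 0 = \frac{\partial f}{\partial t}
   + \sum_{i=1}^{n} \Bigl[\frac{\partial f}{\partial q^i}\frac{\partial H}{\partial p_i}
   - \frac{\partial f}{\partial p_i}\frac{\partial H}{\partial q^i}\Bigr]. \]

For any two functions $f, g \colon T^*M \times \mathbb{R} \to \mathbb{R}$, the Poisson
bracket $\{f, g\}$ was defined classically as

\[ \{f, g\} = \sum_{i=1}^{n} \Bigl[\frac{\partial f}{\partial q^i}\frac{\partial g}{\partial p_i}
   - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q^i}\Bigr]. \]

Thus, $f$ is an integral for $H$ if and only if

\[ \frac{\partial f}{\partial t} + \{f, H\} = 0. \]

In particular, in the case where $H$ and $f$ do not depend on $t$, we simply have
the condition

\[ \{f, H\} = 0. \]

It will be convenient to restrict our attention mainly to this case, although the
general case will arise a couple of times.

Since this classical definition depends on the choice of the canonical coordinates
$(q, p)$, a proof was needed that it is actually well-defined: for a canonical
transformation, given by $(Q, P)$ as on page 570, we have
$\{f, g\}_{[q, p]} = \{f, g\}_{[Q, P]}$, with the subscripts denoting the coordinate
systems used for the computations. But it is easy to give an invariant definition
right from the start. Remembering that the invariant definition

\[ X_g \lrcorner \omega = -dg \]

can be written, as on page 533, as

\[ X_g = \sum_{i=1}^{n} \Bigl[\frac{\partial g}{\partial p_i}\frac{\partial}{\partial q^i}
   - \frac{\partial g}{\partial q^i}\frac{\partial}{\partial p_i}\Bigr], \]

we see that

\[ \{f, g\} = X_g(f). \]

Moreover, since $X_f \lrcorner \omega = -df$, we have

\[ \omega(X_f, X_g) = -df(X_g) = -\{f, g\}, \]

or simply

\[ \{f, g\} = \omega(X_g, X_f). \]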
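The classical coordinate formula is simple to evaluate numerically. The following is a rough sketch rather than a definitive implementation: the function names are invented for the example, the partial derivatives are approximated by central differences, and the sign convention $\{f, g\} = \sum_i (\partial f/\partial q^i\,\partial g/\partial p_i - \partial f/\partial p_i\,\partial g/\partial q^i)$ is the one consistent with $\{f, g\} = X_g(f)$. It is tried out on two components of angular momentum in $\mathbb{R}^3$.

```python
def partial(func, q, p, kind, i, h=1e-5):
    """Central-difference derivative of func(q, p) in the variable q[i] or p[i]."""
    q_plus, q_minus, p_plus, p_minus = list(q), list(q), list(p), list(p)
    if kind == "q":
        q_plus[i] += h
        q_minus[i] -= h
    else:
        p_plus[i] += h
        p_minus[i] -= h
    return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2 * h)


def poisson_bracket(f, g, q, p):
    """{f, g} = sum_i (df/dq^i dg/dp_i - df/dp_i dg/dq^i), evaluated at (q, p)."""
    return sum(
        partial(f, q, p, "q", i) * partial(g, q, p, "p", i)
        - partial(f, q, p, "p", i) * partial(g, q, p, "q", i)
        for i in range(len(q))
    )


# Angular momentum components (indices shifted to start at 0).
def L1(q, p):
    return q[1] * p[2] - q[2] * p[1]


def L2(q, p):
    return q[2] * p[0] - q[0] * p[2]


def L3(q, p):
    return q[0] * p[1] - q[1] * p[0]


q0 = [0.3, -1.2, 0.7]
p0 = [1.1, 0.4, -0.9]

bracket_L1_L2 = poisson_bracket(L1, L2, q0, p0)
bracket_q_p = poisson_bracket(lambda q, p: q[0], lambda q, p: p[0], q0, p0)
```

With this convention `bracket_L1_L2` agrees with `L3(q0, p0)` up to rounding, and `bracket_q_p` is 1, the first of the fundamental brackets; swapping the two arguments of `poisson_bracket` flips the sign.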
Whichever route we take to defining the Poisson bracket, it is clear that
$\{\ ,\ \}$ is skew-symmetric and bilinear over $\mathbb{R}$. One immediate
consequence of skew-symmetry is the


HAMILTONIAN FORM OF NOETHER'S THEOREM. Given functions
$f, g \colon T^*M \to \mathbb{R}$, suppose that the function $f$ is constant along the
1-parameter group of canonical transformations determined by the vector field $X_g$.
Then $g$ is an integral of motion for Hamilton's equations with Hamiltonian $f$.


PROOF. The hypothesis says that $0 = X_g(f)$, so that $0 = \{f, g\}$. Hence
$0 = \{g, f\} = X_f(g)$. ∎


The next thing we want to note is that Poisson brackets provide an easy test 
for a map to be canonical: 




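THEOREM. $\phi \colon T^*M \to T^*M$ is canonical if and only if $\phi$ preserves
$\{\ ,\ \}$, that is,

\[ \phi^*\{f, g\} = \{\phi^* f, \phi^* g\} \quad \text{i.e.,} \quad
   \{f, g\} \circ \phi^{-1} = \{f \circ \phi^{-1}, g \circ \phi^{-1}\}. \]


PROOF. An unraveling of definitions shows that for a vector field $X$ on $M$ we
have

\[ X(f) \circ \phi^{-1} = \phi_* X(f \circ \phi^{-1}). \]

Therefore

\[ \{f, g\} \circ \phi^{-1} = X_g(f) \circ \phi^{-1} = \phi_* X_g(f \circ \phi^{-1}), \]

while

\[ \{f \circ \phi^{-1}, g \circ \phi^{-1}\} = X_{g \circ \phi^{-1}}(f \circ \phi^{-1}). \]

Thus $\{f, g\} \circ \phi^{-1} = \{f \circ \phi^{-1}, g \circ \phi^{-1}\}$ if and only if

\[ X_{g \circ \phi^{-1}} = \phi_* X_g \]

for all $g$, which by Theorem 1 on page 570 is true if and only if $\phi$ is
canonical. ∎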


It is easy to check that for any canonical coordinate system $(q, p)$, we have
the "fundamental Poisson brackets"

\[ \{q^i, q^j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{q^i, p_j\} = \delta^i_j, \]

and conversely, these relationships imply that $(q, p)$ is canonical (Problem 2).
Consequently, a map $(q, p) \mapsto (Q, P)$ is canonical if and only if

\[ \{Q^i, Q^j\} = 0, \qquad \{P_i, P_j\} = 0, \qquad \{Q^i, P_j\} = \delta^i_j. \]


The most important property of $\{\ ,\ \}$ shows that it makes the $C^\infty$
functions on $M$ into a Lie algebra:


THE JACOBI IDENTITY. On $T^*M$ we have

\[ \{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0. \]


This can be checked by a straightforward, though somewhat involved, calculation,
which is often simplified by the following device: When expanded out,
this expression is a sum of terms consisting of the product of two first derivatives
and a second derivative. The terms containing second derivatives of $f$ will be

\[ \{g, \{h, f\}\} + \{h, \{f, g\}\} = (X_g X_h - X_h X_g)(f) = [X_g, X_h](f). \]

But the formula for the Lie bracket of two vector fields involves only first
derivatives, so no second derivatives of $f$ appear, and of course the same is true
for the second derivatives of $g$ and $h$.




Jacobi concocted his identity for a very specific reason: it immediately im- 
plies a result that Poisson had proved many years earlier by rather complicated 
computations. 


COROLLARY (POISSON'S THEOREM). Consider a Hamiltonian $H$ on $T^*M$, and functions
$f_1, f_2 \colon T^*M \to \mathbb{R}$. Suppose $\{f_1, H\} = 0$ and $\{f_2, H\} = 0$,
so that $f_1$ and $f_2$ are each integrals of motion for Hamilton's equations for $H$.
Then also

\[ \{\{f_1, f_2\}, H\} = 0, \]

so that $\{f_1, f_2\}$ is also an integral of motion.
(The proof for the case of $T^*M \times \mathbb{R}$ requires a bit more work,
cf. Problem 3.)


Unfortunately, attempts to apply Poisson’s theorem may turn out to be dis- 
appointing, because the newly derived integrals all too often turn out to be 0, 
or a combination of previously derived integrals. Problem 5 gives the standard 
elementary example of a case where a new integral is found. 


Poisson brackets bis. All these considerations can now be applied to arbitrary
symplectic manifolds $(M, \omega)$ (and extended analogously to $M \times \mathbb{R}$).
In fact, Darboux's theorem shows that around every point we can find a symplectic
coordinate system $(q, p)$, for which we have
$\omega = \sum_{i=1}^{n} dp_i \wedge dq^i$, and we can then simply repeat all the
calculations made for $T^*M$.

But of course, we're not going to take that easy way out! Aside from the 
masochistic pleasure that we will derive from doing everything invariantly, we 
will be able to isolate just where the condition dw = 0 plays a crucial role, and 
we can always hope that we’ll obtain some enlightenment along the way. 


First of all, for a symplectic manifold $(M, \omega)$, we define a diffeomorphism
$f \colon M \to M$ to be symplectic if $f^*\omega = \omega$. Moreover, given
$H \colon M \to \mathbb{R}$, we define the vector field $X_H$ exactly as before,

\[ X_H \lrcorner \omega = -dH, \]

and again call vector fields of this form Hamiltonian vector fields. Using the
Lemma on page 570, we prove, exactly as before, the corresponding


S1. THEOREM. The map $f \colon M \to M$ is symplectic if and only if, for all $H$,

\[ f_*(X_H) = X_{H \circ f^{-1}}. \]

In particular, if $f$ is symplectic, then $f_*$ always takes Hamiltonian vector fields
into Hamiltonian vector fields.




In addition, the first proof of Theorem 2 on page 576 goes through exactly
as before (using the crucial hypothesis $d\omega = 0$) to prove


S2. THEOREM. The flow of any Hamiltonian vector field on the symplectic
manifold $(M, \omega)$ consists of symplectic maps.


On any symplectic manifold $(M, \omega)$, we can use the same definition as before,

\[ \{f, g\} = \omega(X_g, X_f) = X_g(f), \]

and for a symplectic coordinate system we can easily compute that $\{f, g\}$ is
given by the formula in the classical definition.

The argument on page 607, with Theorem S1 in place of Theorem 1 on
page 570, gives


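S3. THEOREM. $\phi \colon M \to M$ is symplectic if and only if for all
$f, g \colon M \to \mathbb{R}$,

\[ \phi^*\{f, g\} = \{\phi^* f, \phi^* g\} \quad \text{or} \quad
   \{f, g\} \circ \phi^{-1} = \{f \circ \phi^{-1}, g \circ \phi^{-1}\}. \]

As with $T^*M$, a coordinate system $(q, p)$ is symplectic if and only if we have
the fundamental Poisson brackets

\[ \{q^i, q^j\} = 0, \qquad \{p_i, p_j\} = 0, \qquad \{q^i, p_j\} = \delta^i_j, \]

and $(q, p) \mapsto (Q, P)$ is symplectic if and only if

\[ \{Q^i, Q^j\} = 0, \qquad \{P_i, P_j\} = 0, \qquad \{Q^i, P_j\} = \delta^i_j. \]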


We also have a result that plays an important role right now as well as later on. 


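S4. THEOREM. On any symplectic manifold,

\[ [X_f, X_g] = -X_{\{f, g\}}. \]

PROOF. By the two Cartan formulas on page 576 we have

\[ \begin{aligned}
[X_f, X_g] \lrcorner \omega
  &= L_{X_f}(X_g \lrcorner \omega) - X_g \lrcorner (L_{X_f}\omega) \\
  &= d\bigl(X_f \lrcorner (X_g \lrcorner \omega)\bigr) + X_f \lrcorner d(X_g \lrcorner \omega) + 0 \\
  &= d\bigl(\omega(X_g, X_f)\bigr) + 0 = -X_{\omega(X_g, X_f)} \lrcorner \omega,
\end{aligned} \]

so

\[ [X_f, X_g] = -X_{\omega(X_g, X_f)} = -X_{\{f, g\}}. \ ∎ \]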




1. COROLLARY (THE JACOBI IDENTITY). On any symplectic manifold,

\[ \{f, \{g, h\}\} + \{g, \{h, f\}\} + \{h, \{f, g\}\} = 0. \]

PROOF. We have

\[ \begin{aligned}
[X_f, X_g](h) &= X_f(X_g(h)) - X_g(X_f(h)) \\
  &= X_f(\{h, g\}) - X_g(\{h, f\}) = \{\{h, g\}, f\} - \{\{h, f\}, g\},
\end{aligned} \]

while Theorem S4 gives

\[ [X_f, X_g](h) = -X_{\{f, g\}}(h) = -\{h, \{f, g\}\}. \ ∎ \]


2. COROLLARY (POISSON'S THEOREM). If $\{f, H\} = 0$ and $\{g, H\} = 0$,
then

\[ \{\{f, g\}, H\} = 0. \]


3. COROLLARY. For $f, g, h \colon M \to \mathbb{R}$ we have

\[ X_h(\{f, g\}) = \{X_h(f), g\} + \{f, X_h(g)\} \]

(the "infinitesimal" version of Theorem S3, compare Problem 7).


4. COROLLARY (Another generalization of Noether's theorem). Two vector
fields $X_H$ and $X_K$ on a connected symplectic manifold $M$ commute, that is,
$[X_H, X_K] = 0$, if and only if $\{H, K\}$ is constant.


PROOF. By Theorem S4, $[X_H, X_K] = 0$ is equivalent to $X_{\{H, K\}} = 0$. Since

\[ X_{\{H, K\}} \lrcorner \omega = -d(\{H, K\}), \]

and $\omega$ is nondegenerate, $X_{\{H, K\}} = 0$ is equivalent to
$d(\{H, K\}) = 0$. ∎
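A minimal worked instance of Corollary 4, using only the coordinate formulas for $X_g$ and $\{f, g\}$ on $T^*M$; the choice $H = q^1$, $K = p_1$ is made purely for illustration.

```latex
% H = q^1 and K = p_1 in canonical coordinates (q, p).
\begin{aligned}
X_H &= \sum_i \Bigl[\frac{\partial q^1}{\partial p_i}\frac{\partial}{\partial q^i}
      - \frac{\partial q^1}{\partial q^i}\frac{\partial}{\partial p_i}\Bigr]
     = -\frac{\partial}{\partial p_1},
&\qquad
X_K &= \sum_i \Bigl[\frac{\partial p_1}{\partial p_i}\frac{\partial}{\partial q^i}
      - \frac{\partial p_1}{\partial q^i}\frac{\partial}{\partial p_i}\Bigr]
     = \frac{\partial}{\partial q^1}, \\
\{H, K\} &= \{q^1, p_1\} = 1,
&\qquad
[X_H, X_K] &= \Bigl[-\frac{\partial}{\partial p_1}, \frac{\partial}{\partial q^1}\Bigr] = 0.
\end{aligned}
```

Here $\{H, K\}$ is constant but not zero, and the two coordinate vector fields commute, so the example shows why the corollary asks for "constant" rather than "zero".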




PROBLEMS 


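1. For a 1-parameter family $\{X^t\}$ of vector fields on $M$ let $\{\phi_t\}$ be the
corresponding 1-parameter family of diffeomorphisms of $M$, with
$\phi_0 = \text{identity}$, that is generated by $\{X^t\}$, so that for
$f \colon M \to \mathbb{R}$ we have

\[ (X^t f)(p) = \lim_{h \to 0} \frac{f(\phi_{t+h}(p)) - f(\phi_t(p))}{h}. \]

For a family $\omega_t$ of $k$-forms on $M$ we define the $k$-form

\[ \dot\omega_t = \lim_{h \to 0} \frac{\omega_{t+h} - \omega_t}{h}. \]

Show that for $\eta(t) = \phi_t^*\omega_t$ we have

\[ \dot\eta_t = \phi_t^*\bigl(L_{X^t}\omega_t + \dot\omega_t\bigr), \]

or more informally,

\[ \frac{d}{dt}\,\phi_t^*\omega_t = \phi_t^*\Bigl(L_{X^t}\omega_t + \frac{d\omega_t}{dt}\Bigr). \]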
2. Let (g, p) be a coordinate system on a symplectic manifold (M, @) satisfying 
{q',q/} =0, {pi, pj} = 9, Cie eee 


(a) Letting A be the matrix of @ in the coordinate system (q, p), show that for 
B = A™! we have — ; 

(Q.quy = be. 
(b) Similarly, 


pies, tp ppb ee, 


af OU Slee. ees 
4=(;, y yas 


and (q, p) is a symplectic coordinate system. 


(c) ‘hus 


Note that this is just the symplectic analogue of the fact that in a Riemannian 
manifold, if a coordinate system (qg!,...,q”) satisfies 


a ayy, 
dg! dq! 


612 Chapter 20 
then the metric is given by 


( ,) =) 0 dq' @ dq’. 


i=] 


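3. (a) For $f, g\colon M \times \mathbb{R} \to \mathbb{R}$, we have

    $\frac{\partial}{\partial t}\{f, g\} = \Bigl\{\frac{\partial f}{\partial t}, g\Bigr\} + \Bigl\{f, \frac{\partial g}{\partial t}\Bigr\}$.

(b) If $f$ and $g$ are integrals of motion, then so is $\{f, g\}$.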


4. (a) $X_{gh} = g X_h + h X_g$.
(b) $\{f, gh\} = g\{f, h\} + h\{f, g\}$.
(c) More generally, for a function $\phi\colon \mathbb{R}^k \to \mathbb{R}$ we have

    $\{\phi \circ (f_1, \dots, f_k), g\} = \sum_{a=1}^{k} D_a\phi(f_1, \dots, f_k) \cdot \{f_a, g\}$,

where $D_a$ is the partial derivative with respect to the $a$th argument.


5. In R?, calculate 
    $\{q^2 p_3 - q^3 p_2,\; q^3 p_1 - q^1 p_3\}$

(this expands out into 16 terms, of which only 2 are non-zero; Problem 4 (b) is
useful). [Answer, printed upside down in the original: $q^1 p_2 - q^2 p_1$.]
Conclude that if angular momentum around the first and second axes are con- 
served for some Hamiltonian, then angular momentum around the third axis 
is also conserved. 


6. If $\{\phi_t\}$ is the flow of $X_H$, then

    $\frac{d}{dt}\, f \circ \phi_t = \{f \circ \phi_t, H\} = \{f, H\} \circ \phi_t$

(sometimes written as $\dot f = \{f, H\}$, and called the equation of motion in Poisson
bracket form).


7. (a) If $\{\phi_t\}$ is the flow of $X_H$, show that

    $\frac{d}{dt}\Big|_{t=0} X_{\phi_t^* f} \lrcorner\, \omega = -\frac{d}{dt}\Big|_{t=0} d(\phi_t^* f) = -d(X_H f) = X_{X_H f} \lrcorner\, \omega$,


Symplectic Manifolds 613 


so that we have

    $\frac{d}{dt}\Big|_{t=0} X_{\phi_t^* f} = X_{X_H f}$

(it may be useful to compare Problem 19-1 (b) for the justification of some steps).
(b) Taking the derivative at 0 of

    $\phi_t^*(\{f, g\}) = \{\phi_t^* f, \phi_t^* g\}$,

which follows from Theorem S3, prove Corollary 3. One could then use this
result to prove the Jacobi identity without using Theorem S4, and Theorem S4
itself from the Jacobi identity.


CHAPTER 21 


LIOUVILLE
INTEGRABILITY


Liouville’s second main theorem about Hamiltonian mechanics characterizes
a large class of Hamiltonian systems that are “integrable”, meaning that 
they can be solved completely by quadratures (in terms of indefinite integrals, 
and solving for implicit functions), and it covers virtually all systems with this 
property that were known to classical mechanics. 


Functions in involution. In Chapter 18 we saw that a complete integral for the
Hamilton-Jacobi equation for some Hamiltonian on $T^*M \times \mathbb{R}$ allows us to solve
Hamilton’s canonical equations by quadratures. Moreover, by Corollary 2 on
page 603, there is a canonical transformation $(q, p) \mapsto (Q, P)$ such that $n$
curves $Q^\alpha$ = constant or $P_\beta$ = constant are solutions, with the $\alpha$’s disjoint from
the $\beta$’s. The fundamental Poisson brackets on page 607 or 609 show that in this
case, any pair from among this particular set of $Q^\alpha$ and $P_\beta$ are in involution,
that is, their Poisson bracket is 0.

From this point of view, rather than viewing a case where the Poisson bracket
of two integrals $f_1$ and $f_2$ satisfies $\{f_1, f_2\} = 0$ as a disappointing aspect of
Poisson’s theorem, it can instead be considered as a promising circumstance,
suggesting that $f_1$ and $f_2$ might be part of a symplectic coordinate system in
which Hamilton’s equations become trivial to solve.

In fulfillment of this promise, Liouville’s theorem shows that the existence
of any $n$ integrals in involution with each other guarantees that we can solve
Hamilton’s equations by quadratures. In fact, given $n$ independent functions
$f_1, \dots, f_n$ on $T^*M \times \mathbb{R}$ that are in involution with each other, there is a way to
find, by quadratures, a symplectic coordinate system $(Q, P)$ with $P_i = f_i$, and
if the $f_i$ are all integrals for a Hamiltonian $H$ on $T^*M \times \mathbb{R}$, the $Q^i$ obtained in
this way will also be integrals for $H$, so that having half the maximum number
of independent integrals will automatically determine a complementary half,
allowing Hamilton’s equations to be solved by quadratures.


More generally, and in more detail, consider a symplectic manifold $(M, \omega)$
of dimension $2n$, with symplectic coordinates $q^1, \dots, q^n, p_1, \dots, p_n$ around a
point $\mathfrak{p}_0$. Suppose the functions $f_1, \dots, f_n\colon M \times \mathbb{R} \to \mathbb{R}$ are in involution with
each other, $\{f_i, f_j\} = 0$, and are independent at $(\mathfrak{p}_0, t_0)$, that is, $df_1, \dots, df_n$
are linearly independent in a neighborhood of $(\mathfrak{p}_0, t_0)$. Then for any constants


614 


Liouville Integrability 615 


$a_1, \dots, a_n$, the set $N \subset M$, defined by $N = \{\mathfrak{p} : f_i(\mathfrak{p}, t_0) = a_i \text{ for all } i\}$ is, in
a neighborhood of $\mathfrak{p}_0$, an $n$-dimensional submanifold (if it is not empty).
If we consider the vector fields $X_1, \dots, X_n$ defined by

    $X_j \lrcorner\, \omega = df_j$  (in our usual notation, $X_j = -X_{f_j}$),

then the $X_j$ are linearly independent at any $\mathfrak{q} \in N$, since $\omega$ is nonsingular;
moreover, the $X_j$ are all tangent to $N$, since

    $df_k(X_j) = X_j(f_k) = -X_{f_j}(f_k) = \{f_j, f_k\} = 0$.

Since the tangent space of $N$ at $\mathfrak{q}$ is thus spanned by $X_1, \dots, X_n$ at $\mathfrak{q}$ and
$\omega(X_j, X_k) = \{f_j, f_k\} = 0$, we have the

ISOTROPY LEMMA. Each tangent space of $N$ is isotropic: $\omega$ restricted to
the tangent space is zero.


Our assumption concerning independence of $(f_1, \dots, f_n)$ means that the
$n \times 2n$ matrix

    $\left(\frac{\partial f_k}{\partial (q^i, p_i)}\right)$

has rank $n$ at $(\mathfrak{p}_0, t_0)$. It follows that for $\mathfrak{q}$ in a neighborhood of $\mathfrak{p}_0$, to which
we now restrict our attention, among the $q^i$, $p_i$ there are $n$ coordinates

    $r_1, \dots, r_n$  with  $0 \ne \det\left(\frac{\partial f_i}{\partial r_j}\right)(\mathfrak{q}, t_0)$.

From the isotropy lemma and the theorem on page 602 we conclude that we
can choose the coordinates $r_1, \dots, r_n$ so that

(a) the indices for the $q$’s and those for the $p$’s are disjoint.

Now the map $(q^i, p_i) \mapsto (p_i, -q^i)$ is canonical, as pointed out on page 571.
We can make this switch for any $q^i$ in our collection, and (a) ensures that we do
not get any duplications of the $p_i$. Thus, simply by renaming coordinates, we
can assume that in fact

    $0 \ne \det\left(\frac{\partial f_i}{\partial p_j}\right)(\mathfrak{p}_0, t_0)$.


These rather non-classical considerations will serve as preliminaries for a quite 
classic proof of Liouville’s theorem, though we state it more generally in terms 
of symplectic manifolds. The modern (re)formulation begins immediately
afterward, on page 620, and it will be quite interesting to see how various steps
in the classic proof correlate with steps in the modern treatment. 


616 Chapter 21 


THEOREM (LIOUVILLE). Let $(M, \omega)$ be a symplectic manifold of dimension
$2n$, and let $f_1, \dots, f_n\colon M \times \mathbb{R} \to \mathbb{R}$ be functions that are in involution with
each other, and such that $df_1, \dots, df_n$ are linearly independent at $(\mathfrak{p}_0, t_0)$.


I. Then there are functions $Q^1, \dots, Q^n\colon M \times \mathbb{R} \to \mathbb{R}$ in a neighborhood of
$(\mathfrak{p}_0, t_0)$ such that the coordinate system $(Q^1, \dots, Q^n, f_1, \dots, f_n, t)$ is symplectic,
and $(Q^1, \dots, Q^n)$ can be found by quadratures.


II. Moreover, if the $f_i$ are integrals of a Hamiltonian $H$ on $M \times \mathbb{R}$, our
method of choosing the $Q^i$ will ensure that they are also integrals.


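PROOF. We work locally, in a neighborhood of $(\mathfrak{p}_0, t_0)$, identifying $M$ with
$\mathbb{R}^n \times \mathbb{R}^n$, and, as on page 615, renaming the coordinates $(q, p)$ on $M$ so that
we have

(*)  $0 \ne \det\left(\frac{\partial f_i}{\partial p_j}\right)(\mathfrak{p}_0, t_0)$.

I. Condition (*) implies, by the implicit function theorem, that there are
functions $\psi_i\colon \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ with

(1)  $f_i(q, \psi(q, p, t), t) = p_i$,  $\psi = (\psi_1, \dots, \psi_n)$;

note that we then also have

(2)  $\delta_{ik} = \frac{\partial p_i}{\partial p_k} = \sum_{l=1}^{n} \frac{\partial f_i}{\partial p_l}\,\frac{\partial \psi_l}{\partial p_k}$, so that, by (*), $\left(\frac{\partial \psi_i}{\partial p_k}\right)$ is nonsingular.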


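Taking $\partial/\partial q^s$ of equation (1) gives

    $\frac{\partial f_i}{\partial q^s} + \sum_{r=1}^{n} \frac{\partial f_i}{\partial p_r}\,\frac{\partial \psi_r}{\partial q^s} = 0$,

which we can write in terms of matrices as

    $A + BC = 0$  for  $A = \left(\frac{\partial f_i}{\partial q^s}\right)$,  $B = \left(\frac{\partial f_i}{\partial p_r}\right)$,  $C = \left(\frac{\partial \psi_r}{\partial q^s}\right)$.

The hypothesis

    $0 = \{f_i, f_k\} = \sum_{s=1}^{n} \left(\frac{\partial f_i}{\partial q^s}\frac{\partial f_k}{\partial p_s} - \frac{\partial f_i}{\partial p_s}\frac{\partial f_k}{\partial q^s}\right)$

can be written $AB^t = BA^t$, hence $B^{-1}A = A^t(B^t)^{-1}$, and thus $C = C^t$, i.e.,

(3)  $\frac{\partial \psi_r}{\partial q^s} = \frac{\partial \psi_s}{\partial q^r}$  for all $r, s$.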








Liouville Integrability 617 


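[We have thus shown that $\sum_{i=1}^{n} \psi_i\,dq^i$ is closed (as a function only of the $q$’s),
by an argument similar to that used for the isotropy lemma.] Finally, (3) implies
that there is a function $S\colon \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ such that

(4)  $\psi_i(q, p, t) = \frac{\partial S}{\partial q^i}(q, p, t)$,

where $S$ can be written as

(5)  $S(q, p, t) = \int_{(q_0, p, t)}^{(q, p, t)} \psi_1(x, p, t)\,dx^1 + \cdots + \psi_n(x, p, t)\,dx^n$,

the integral being taken along any path from a fixed $(q_0, p, t)$ to $(q, p, t)$.
We have $\partial^2 S/\partial p_j \partial q^i = \partial/\partial p_j(\partial S/\partial q^i) = \partial \psi_i/\partial p_j$, and $(\partial \psi_i/\partial p_j)$ is
nonsingular by (2), so

    $\left(\frac{\partial^2 S}{\partial p_j\,\partial q^i}\right)$ is nonsingular.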


We can thus use $S$ to obtain a type 2 generating function, as on the bottom of
page 580: We first find $n$ functions $P_i\colon \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ satisfying

(6)  $p_i = \frac{\partial S}{\partial q^i}(q, P(q, p, t), t)$,  $P = (P_1, \dots, P_n)$,

and then define the $Q^i$ appropriately [the actual definition isn’t important here,
just the fact that $S$ is a generating function for $(Q, P)$].
In view of (4), we can write (6) as

(7)  $p_i = \psi_i(q, P(q, p, t), t)$,

and thus

(8)  $f_i(q, p, t) = f_i(q, \psi(q, P(q, p, t), t), t) = P_i(q, p, t)$  by (1),

proving the first part of the theorem.


II. Now suppose, in addition, that the $f_i$ are integrals for a Hamiltonian $H$
on $M \times \mathbb{R}$. Substituting (8) back into (7) gives

(9)  $p_r = \psi_r(q, f(q, p, t), t)$,  $f = (f_1, \dots, f_n)$,

and taking the partial derivative with respect to $t$, $q^s$, and $p_s$ then gives us the


618 Chapter 21 


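following equations [all partial derivatives of $\psi_r$ evaluated at $(q, f(q, p, t), t)$]:

(10)  $0 = \frac{\partial \psi_r}{\partial t} + \sum_{i=1}^{n} \frac{\partial \psi_r}{\partial p_i}\,\frac{\partial f_i}{\partial t}$,

(11)  $0 = \frac{\partial \psi_r}{\partial q^s} + \sum_{i=1}^{n} \frac{\partial \psi_r}{\partial p_i}\,\frac{\partial f_i}{\partial q^s}$,

(12)  $\delta_{rs} = \sum_{i=1}^{n} \frac{\partial \psi_r}{\partial p_i}\,\frac{\partial f_i}{\partial p_s}$.

But the $f_i$ are integrals, so (page 606) we have

    $\frac{\partial f_i}{\partial t} + \{f_i, H\} = 0$,

and substituting into (10) we get

(13)  $\frac{\partial \psi_r}{\partial t} = \sum_{i=1}^{n} \frac{\partial \psi_r}{\partial p_i} \sum_{s=1}^{n} \left(\frac{\partial f_i}{\partial q^s}\frac{\partial H}{\partial p_s} - \frac{\partial H}{\partial q^s}\frac{\partial f_i}{\partial p_s}\right)$.

Together with (11) and (12), and remembering (3), this gives

(14)  $\frac{\partial \psi_r}{\partial t} = -\left(\frac{\partial H}{\partial q^r} + \sum_{s=1}^{n} \frac{\partial H}{\partial p_s}\,\frac{\partial \psi_s}{\partial q^r}\right)$.

Now consider a new Hamiltonian $G$ defined by

    $G(q, p, t) = H(q, \psi(q, p, t), t)$.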


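Taking (9) into account, we see that

    $\frac{\partial G}{\partial q^r}(q, f(q, p, t), t) = \frac{\partial H}{\partial q^r}(q, p, t) + \sum_{s=1}^{n} \frac{\partial H}{\partial p_s}(q, p, t)\,\frac{\partial \psi_s}{\partial q^r}(q, f(q, p, t), t)$
    $\qquad = -\frac{\partial \psi_r}{\partial t}(q, f(q, p, t), t)$  by (14),

and thus, in general,

(15)  $\frac{\partial G}{\partial q^r} = -\frac{\partial \psi_r}{\partial t}$.

But then (5) gives [adding a constant to $G$ if necessary]

    $\frac{\partial S}{\partial t} = -\int dG = -G$,  so that  $G + \frac{\partial S}{\partial t} = 0$.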





Liouville Integrability 619 


Equation (F2-c) on page 584 then shows that the Hamiltonian $K$ for $G$ in the
$(Q, P)$ coordinates is $K = 0$. This, in turn, means that for any solution $c$ of
Hamilton’s equations for $G$, we have $Q \circ c$ = constant.

On the other hand, if $\gamma$ is a solution of Hamilton’s equations for the original
Hamiltonian $H$, and we let

    $c(t) = \bigl(q(\gamma(t)),\, f(q(\gamma(t)), p(\gamma(t)), t),\, t\bigr)$
    $\qquad = \bigl(q(\gamma(t)),\, a_1, \dots, a_n,\, t\bigr)$  for certain constants $a_i$,

then we find that $c$ satisfies Hamilton’s equations for $G$, so

    $Q^i(\gamma(t), t) = Q^i(q(\gamma(t)), a_1, \dots, a_n, t) = Q^i(c(t), t)$ = constant. ∎


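When the Hamiltonian $H$ and the functions $f_1, \dots, f_n$ do not depend on $t$,
the approach on pages 587-588 works out as follows. Equation (10) is now
superfluous, and since the $f_i$ are integrals, we have

    $0 = \{f_i, H\} = \sum_{s=1}^{n} \left(\frac{\partial f_i}{\partial q^s}\frac{\partial H}{\partial p_s} - \frac{\partial H}{\partial q^s}\frac{\partial f_i}{\partial p_s}\right)$,

from which we obtain the special case of (13),

    $0 = \sum_{i=1}^{n} \frac{\partial \psi_r}{\partial p_i} \sum_{s=1}^{n} \left(\frac{\partial f_i}{\partial q^s}\frac{\partial H}{\partial p_s} - \frac{\partial H}{\partial q^s}\frac{\partial f_i}{\partial p_s}\right)$,

and thus, using (11), (12) and (3), the special case of (14),

    $0 = -\left(\frac{\partial H}{\partial q^r} + \sum_{s=1}^{n} \frac{\partial H}{\partial p_s}\,\frac{\partial \psi_s}{\partial q^r}\right)$.

For the Hamiltonian $G$ we then find the special case of (15),

    $\frac{\partial G}{\partial q^r} = 0$,

so that we have

    $H\left(q, \frac{\partial S(q, P)}{\partial q}\right) = H(q, \psi(q, P)) = K(P)$,

exactly as in the case of the equation (*) on page 587. So in this case we
simply conclude that under the canonical transformation obtained in part I of
the proof, the solution curves $\gamma$ have all $P_i(\gamma(t)) = a_i$ for certain constants $a_i$,
while the $Q^i(\gamma(t))$ are of the form

    $Q^i(\gamma(t)) = Q^i(\gamma(0)) + t\,\nu_i(a_1, \dots, a_n)$

for certain functions $\nu_i$.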


620 Chapter 21 


We’ve noted this special case, and its specific outcome, because nowadays 
“Liouville’s theorem” usually refers to a result involving an additional important 
feature of mechanics problems, and the original result has become but one 
strand of a geometric tapestry in which analytic calculations are largely replaced 
by geometric constructions. This additional feature involves Hamiltonians that 
are time-independent, which will be reflected in the whole geometric approach. 


Conditional periodicity and the invariant tori. One of the very first problems of 
mechanics that we considered was motion under an inverse square force, with 
the solutions neatly separating into two classes, the unbounded orbits, parabolas 
and hyperbolas, and the bounded elliptical orbits, which are periodic. 


The pendulum is an even simpler situation where we encounter periodic 
motion. It should be pointed out that the pendulum doesn’t have to oscillate, as 
in (a); it can also rotate, periodically, about the pivot point (b) if started with a
large enough initial velocity. In addition, the pendulum can simply hang straight
down, or, as on page 214, remain vertically above the pivot, when our “string” 
is a massless thin rigid rod (or we are dealing with a “compound” or “physical” 
pendulum). As we will point out later on, there are also two non-closed solution 
paths, currently hiding out, teasing you to discover them. 


The equation for harmonic oscillation $x'' + \omega^2 x = 0$ that one gets for springs
or simple electronic circuits again gives periodic solutions. However, when we
consider the 2-dimensional case (pages 293-295), with the $x_1$-coordinate and
the $x_2$-coordinate separately satisfying

    $x_1'' + \omega_1^2 x_1 = 0$,
    $x_2'' + \omega_2^2 x_2 = 0$,

we obtain Lissajous figures, which can be periodic, but need not be, even though
the components do exhibit periodic motion, since the periods might not be
commensurable. This is called “conditionally periodic” motion [only under
special conditions does the motion become periodic].
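
To make the commensurability condition concrete, here is a small sketch (not from the text) that, for rational frequencies $\omega_1, \omega_2$, computes the smallest common period of the two oscillations and checks that the full state $(x_1, x_1', x_2, x_2')$ returns to its initial value. The function names `common_period_over_2pi` and `state`, and the choice of initial conditions $x_i(t) = \cos \omega_i t$, are assumptions made for illustration. When $\omega_1/\omega_2$ is irrational no such common period exists.

```python
from fractions import Fraction
from math import gcd, cos, sin, pi

def common_period_over_2pi(w1, w2):
    """Smallest T/(2*pi) > 0 with w1*T and w2*T both multiples of 2*pi.

    w1 and w2 must be positive rationals (int or Fraction).
    """
    # Individual periods divided by 2*pi, as fractions in lowest terms.
    p1, p2 = 1 / Fraction(w1), 1 / Fraction(w2)
    # lcm of two reduced fractions = lcm(numerators) / gcd(denominators).
    lcm_num = p1.numerator * p2.numerator // gcd(p1.numerator, p2.numerator)
    return Fraction(lcm_num, gcd(p1.denominator, p2.denominator))

def state(w1, w2, t):
    """(x1, x1', x2, x2') for the solutions x_i(t) = cos(w_i * t)."""
    return (cos(w1 * t), -w1 * sin(w1 * t), cos(w2 * t), -w2 * sin(w2 * t))

w1, w2 = Fraction(2, 3), Fraction(1, 2)
T = 2 * pi * float(common_period_over_2pi(w1, w2))
print(common_period_over_2pi(w1, w2))       # common period in units of 2*pi
print(state(w1, w2, 0.0), state(w1, w2, T)) # the two states should agree
```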


Another example of such conditionally periodic motion is given by the spherical
pendulum on pages 290-291. Here the pendulum rotates around the $z$-axis
as $\phi$ goes from 0 to $2\pi$, while $\theta$ oscillates between $\arccos u_1$ and $\arccos u_2$.


Liouville Integrability 621 


Conditional periodicity also occurs for the orbits discussed on pages 128-129 
(in fact, it was astronomers who first introduced the term), and the herpolhode 


of Chapter 9. 


For a 3-dimensional example, recall (page 349) that the motion of a heavy
top is generally determined by three periodic motions, the rotation about the
axis, the nutation, and the precession $\phi$, which need not have commensurable
periods, so that we again have a conditionally periodic motion.


Finally, page 478 gives an n-dimensional example of conditional periodicity. 


When a periodic motion is not an oscillation, physicists often like to describe
the motion in terms of “multiple-valued functions”. For example, when a
pendulum doesn’t oscillate, but instead rotates about the pivot point, $\theta$ may be
thought of as taking on values in $[0, 2\pi)$, followed by values in $[2\pi, 4\pi)$, etc.
For the conical pendulum, $\phi$ is similarly multiple-valued; and of course for the
heavy top, the rotation about the axis and the precession $\phi$ can be regarded as
multiple-valued functions (while the nutation is an oscillation).


Mathematicians also employ the lingo of multiple-valued functions, but they 
have the luxury of alternately describing these same phenomena geometrically. 


INVARIANT TORI THEOREM (ARNOLD). Let $(M, \omega)$ be a symplectic
manifold of dimension $2n$, and let $f_1, f_2, \dots, f_n\colon M \to \mathbb{R}$ be $n$ functions
that are in involution with each other. For some $n$-tuple of constants
$a = (a_1, \dots, a_n)$, let

    $M_a = \{x \in M : f_i(x) = a_i\}$,  i.e.,  $M_a = f^{-1}(a)$ for $f = (f_1, \dots, f_n)$,

and suppose that the $df_i$ are linearly independent at each point of $M_a$ (that is, $f_*$
has maximal rank $n$ at each point), so that $M_a$ is an $n$-dimensional submanifold.
Then each (non-empty) compact component $C_a$ of $M_a$ is diffeomorphic to
an $n$-torus $S^1 \times \cdots \times S^1$.
Moreover, if the $f_i$ are in involution with some Hamiltonian $H\colon M \to \mathbb{R}$,
then the solutions for Hamilton’s equations (the solution curves $\gamma$ of $X_H$) take $C_a$
into itself, and $C_a$ has coordinates $(\varphi^1, \dots, \varphi^n)$ such that these $\gamma$ satisfy

    $\varphi^i(\gamma(t)) = \varphi^i(\gamma(0)) + \nu_i(a) \cdot t$

for constants $\nu_i(a)$, the [circular] frequencies for $C_a$ (the usual frequencies will
then be $\nu_i/2\pi$, and the periods, the reciprocals of the frequencies, are $2\pi/\nu_i$).


Note: The theorem involves circular frequencies because we will choose the $\varphi^i$
to repeat on intervals of length $2\pi$, rather than length 1. This convention is not
universally used, which one has to bear in mind for the various formulas that
will occur starting on page 631.


622 Chapter 21 


PROOF. As before, we consider the vector fields $X_1, \dots, X_n$ defined by

    $X_j \lrcorner\, \omega = df_j$  ($X_j = -X_{f_j}$).

We have already seen that they are linearly independent everywhere on $M_a$,
and everywhere tangent to $M_a$.

In addition, the vector fields $X_j$ commute, that is, $[X_j, X_k] = 0$ for all $j, k$,
since Theorem S4 on page 609 gives

    $[X_j, X_k] = [X_{f_j}, X_{f_k}] = -X_{\{f_j, f_k\}} = 0$.


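Since $C_a$ is compact, the flow $\{\rho_j^t\}$ of $X_j$ on $C_a$ is defined for all $t$, and since
the $X_j$ commute, the $\{\rho_j^t\}$ and $\{\rho_k^s\}$ commute. For $t = (t_1, \dots, t_n) \in \mathbb{R}^n$ define

    $\rho^t\colon C_a \to C_a$  by  $\rho^t = \rho^{(t_1, \dots, t_n)} = \rho_1^{t_1} \circ \cdots \circ \rho_n^{t_n}$.

Since the $\rho_j^t$ and $\rho_k^s$ commute, we have $\rho^{t+s} = \rho^t \circ \rho^s$ for all $s, t \in \mathbb{R}^n$.
Choosing some fixed point $\mathfrak{p} \in C_a$, we can define

    $\rho\colon \mathbb{R}^n \to C_a$  by  $\rho(t) = \rho^t(\mathfrak{p})$,

so that $\mathfrak{p}$ is moved along the first flow for time $t_1$, then along the second flow
for time $t_2$, etc. Commutativity of the flows implies that $\rho(t + s) = \rho^t(\rho(s))$.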
Since the $X_j$ are linearly independent at $\mathfrak{p}$, the implicit function theorem
shows that a neighborhood of $0 \in \mathbb{R}^n$ will be taken by $\rho$ onto a neighborhood
of $\mathfrak{p}$ in $C_a$. Moreover, if $\mathfrak{q} = \rho(t)$, and $\bar\rho$ is the map defined like $\rho$, but using $\mathfrak{q}$
instead of $\mathfrak{p}$, then we have

    $\rho(t + s) = \bar\rho(s)$,

which shows that in fact the image $\rho(\mathbb{R}^n)$ is an open subset of $C_a$. By the same
token, it is also clearly a closed subset, for if $\mathfrak{q}$ is in the closure of the image,
then the image of its $\bar\rho$ contains a point $\bar\rho(s) = \rho(t)$ and then $\mathfrak{q} = \rho(t - s)$.
It thus follows that the image of $\rho\colon \mathbb{R}^n \to C_a$ is all of $C_a$.
It is easy to see that the set

    $G = \{t \in \mathbb{R}^n : \rho(t) = \mathfrak{p}\}$

is a subgroup of $\mathbb{R}^n$, and moreover we get the same subgroup if we start with
a different $\mathfrak{p}$ (and correspondingly different $\rho$). Finally, the fact that a
neighborhood of $0 \in \mathbb{R}^n$ is taken onto a neighborhood of $\mathfrak{p}$ [no matter which $\mathfrak{p}$ we
choose] shows that $G$ is a discrete subgroup of $\mathbb{R}^n$.

An easy exercise (Problem 1) shows that for some $k \le n$ there are linearly
independent elements $b_1, \dots, b_k$ of $\mathbb{R}^n$ such that $G$ is precisely the set of integer

Liouville Integrability 623 


linear combinations of the $b_i$,

    $G = \{m^1 b_1 + \cdots + m^k b_k : m^1, \dots, m^k \in \mathbb{Z}\}$.

Then $C_a$ is isomorphic, as a quotient group, and diffeomorphic, as a quotient
manifold, to $\mathbb{R}^n/G$. Since $C_a$ is compact, we must have $k = n$, so that $C_a$ is
diffeomorphic to the $n$-torus $S^1 \times \cdots \times S^1$, with universal covering space $\mathbb{R}^n$
under the map $\rho\colon \mathbb{R}^n \to C_a$.¹

Every $\mathfrak{q} \in C_a$ is $\rho(s^1 b_1 + \cdots + s^n b_n)$ for unique $0 \le s^i < 1$, and letting

    $\varphi^i(\mathfrak{q}) = 2\pi s^i$

we obtain “angular coordinates” $\varphi^1, \dots, \varphi^n$ on $C_a$, which we can extend to
multiple-valued coordinates mod $2\pi$ on $C_a$ (like the standard use of $\theta$ on $S^1$).
We will reserve $\varphi$ [as opposed to $\phi$] for these coordinates.

The final statement of the theorem follows from the fact that the $f$’s are
constants along the integral curves of $X_H$. ∎
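
As an illustration of the last construction, the following sketch computes the angular coordinates for $n = 2$: given a basis $b_1, b_2$ of the lattice $G$ and a flow parameter $t \in \mathbb{R}^2$, it solves $t = s^1 b_1 + s^2 b_2$, reduces the $s^i$ mod 1, and returns $2\pi s^i$. The function name `angle_coordinates` and the sample lattice are hypothetical; only the recipe $\varphi^i = 2\pi s^i$ comes from the proof.

```python
from math import pi

def angle_coordinates(t, b1, b2):
    """Angles (phi^1, phi^2) in [0, 2*pi) of the point rho(t) on the torus R^2/G.

    G is the lattice of integer combinations of b1 and b2 (2-tuples).
    """
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 must be linearly independent")
    # Cramer's rule for t = s1*b1 + s2*b2.
    s1 = (t[0] * b2[1] - t[1] * b2[0]) / det
    s2 = (b1[0] * t[1] - b1[1] * t[0]) / det
    # Points differing by an element of G get the same angles.
    return (2 * pi * (s1 % 1.0), 2 * pi * (s2 % 1.0))

b1, b2 = (2.0, 0.0), (1.0, 3.0)
print(angle_coordinates((2.0, 4.5), b1, b2))   # t = 0.25*b1 + 1.5*b2
print(angle_coordinates((6.0, 10.5), b1, b2))  # same point shifted by b1 + 2*b2
```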


Remark. The $\varphi$’s are “dual” to the $X$’s, with $\partial/\partial\varphi^i$ being constant linear
combinations of the $X_j$ on $C_a$.


Various names for these tori are: the invariant tori, the Liouville tori, the Arnold 
tori, or the Liouville-Arnold tori (remarks at the end of the chapter may give 
some elucidation). 


• As the simplest, 1-dimensional, example of the theorem, we consider the
harmonic oscillator, whose Hamiltonian was found in Problem 17-1 (a). We set
$m = 1$ for simplicity, and to begin with, we consider $\omega = 1$, so that we are
working with the equation $\ddot q + q = 0$, and $H = \frac12(q^2 + p^2)$. In this 1-dimensional
example, our manifold $M$ is now simply $T^*\mathbb{R} = \mathbb{R}^2$ with coordinates $q$ and $p$,
and we simply choose $f_1 = H$. The various $M_a = C_a$ are now the sets of
constant energy $a$,

    $\{(q, p) : q^2 + p^2 = 2a\}$.

At the origin $(0, 0)$ we have $dH = 0$ and $M_0$ is just a point, while all other $M_a$
are 1-dimensional tori, the circles of radius $\sqrt{2a}$.


¹ More generally, if we don’t assume $C_a$ is compact, but assume that the $X_j$ are complete,
we can conclude that $C_a$ is diffeomorphic to the product of $\mathbb{R}^{n-k}$ and a $k$-torus.


624 Chapter 21 


The collection of these circles, together with the origin, is called the phase
portrait for this Hamiltonian. Rather than showing how the position $q$ of the
particle varies with time, the phase portrait shows how $p$ varies with $q$. Since
$X_H(q, p) = (p, -q)$, a typical solution curve, lying on some $M_a$, is

    $t \mapsto (q(t), p(t)) = (\sqrt{2a}\cos t,\; -\sqrt{2a}\sin t)$.

As $t$ goes from 0 to $\pi$, with $q$ going from $\sqrt{2a}$ to $-\sqrt{2a}$, the corresponding point
$(q, p)$ moves from right to left along the bottom of the circle, where $p < 0$, while
as $t$ goes from $\pi$ to $2\pi$, with $q$ going from $-\sqrt{2a}$ back to $\sqrt{2a}$, the point $(q, p)$
moves from left to right along the top of the circle, where $p > 0$.
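
The clockwise motion can be confirmed numerically. The sketch below integrates the vector field $X_H(q, p) = (p, -q)$ with a classical fourth-order Runge-Kutta step (a standard integrator chosen here for illustration, not something the text uses) and compares the result with the closed-form curve $(\sqrt{2a}\cos t, -\sqrt{2a}\sin t)$. The names `field`, `rk4_step` and `flow` are hypothetical.

```python
from math import cos, sin, sqrt, pi

def field(q, p):
    """The Hamiltonian vector field X_H(q, p) = (p, -q) for H = (q^2 + p^2)/2."""
    return p, -q

def rk4_step(q, p, dt):
    """One classical Runge-Kutta step for (q', p') = field(q, p)."""
    k1q, k1p = field(q, p)
    k2q, k2p = field(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = field(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = field(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6)

def flow(a, t, steps=2000):
    """Approximate position at time t of the solution starting at (sqrt(2a), 0)."""
    q, p = sqrt(2 * a), 0.0
    dt = t / steps
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
    return q, p

a = 0.5
for t in (pi / 2, pi, 2 * pi):
    exact = (sqrt(2 * a) * cos(t), -sqrt(2 * a) * sin(t))
    print(t, flow(a, t), exact)   # numerical and closed-form values should agree
```

At $t = \pi/2$ the point has $p < 0$, so it has moved from the positive $q$-axis into the lower half of the circle, in agreement with the description above.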

In terms of our proof of the theorem, a flow of $X_H$ in $M_a$ (a trajectory
$\gamma\colon \mathbb{R} \to M_a$) makes $\mathbb{R}$ into a covering space of $M_a$. [Since $\gamma$ goes clockwise,
choosing $b_1 = 1$ makes the corresponding $\varphi$ coordinate equal $-\theta$ for the usual
polar coordinate $\theta$; choosing $b_1 = -1$ would give the standard picture.] On $M_a$
the trajectory $\gamma$ simply has the equation

    $\varphi(\gamma(t)) = \varphi(\gamma(0)) + 1 \cdot t$,  or
    $\varphi(t) = \varphi(0) + 1 \cdot t$  (“condensed”).

[Figure: the circle $M_a$ with the points $\varphi = \pi/4$, $\varphi = \pi/2$, $\varphi = 3\pi/4$ marked.]

Note that the motion along the image of $\gamma$ is an immersion; while the particle
oscillates back and forth along the line, with velocity 0 at some points, the
corresponding curve in the phase portrait simply moves steadily around the
circle (but $q$ can’t serve as a coordinate system at points where the velocity is 0).


• If we consider the more general equation $\ddot q + \omega^2 q = 0$, with Hamiltonian
$H = \frac12(\omega^2 q^2 + p^2)$, we will have ellipses rather than circles, and a typical
solution curve is now

    $t \mapsto \Bigl(\frac{\sqrt{2a}}{\omega}\cos\omega t,\; -\sqrt{2a}\sin\omega t\Bigr)$,

so a complete trajectory is traversed on $[0, 2\pi/\omega]$. This means that in order
for $\varphi$ to be a multiple-valued function mod $2\pi$, we have to let $\varphi = \omega \cdot t$ along
a trajectory [in terms of our proof of the theorem, the basis $b_1$ of $G$ now has
length $1/\omega$ rather than 1], and the trajectories $\gamma$ are thus of the form

    $\varphi(t) = \varphi(0) + \omega \cdot t$,
    i.e., $\varphi(\gamma(t)) = \varphi(\gamma(0)) + \omega \cdot t$.


Liouville Integrability 625 


Moreover, the line $\varphi = k$ no longer corresponds to the line $\theta = -k$. For
each $\varphi$, we are at the point $\bigl(\frac{\sqrt{2a}}{\omega}\cos\varphi,\, -\sqrt{2a}\sin\varphi\bigr)$, with slope $-\omega\tan\varphi$,
rather than $-\tan\varphi$, so all the slopes are multiplied by $\omega$.




y = arctan(w tan 0) 


• The pendulum provides a more illuminating 1-dimensional example. For
a pendulum of length $l$ (and a bob of mass 1), the Lagrangian is

    $L(\theta, \dot\theta) = \frac12 l^2 \dot\theta^2 + gl\cos\theta$,

where we take the potential to be $V = -gl\cos\theta$, varying from $gl$ at $\theta = \pi$ to
$-gl$ at $\theta = 0$.
With $\theta$ now denoted by $q$, the Hamiltonian is

    $H(q, p) = \frac{p^2}{2l^2} - gl\cos q$.

Consider first the curves $H = E$ for small $E$; in the figure below, the numbers
show the value of $E/gl$ on the curve to which they point. The important
difference in this case is that trajectories along the various curves take different times
to complete one cycle (the pendulum isn’t isochronous), so the frequencies $\nu(E)$
vary from curve to curve, and the set of points $\varphi$ = constant for the different $M_E$
no longer lie along a straight line.
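
The dependence of the period on the energy can be made quantitative. The sketch below relies on a standard fact that is not derived in the text: for amplitude $q_0$ (so that $E = -gl\cos q_0$), the period of the pendulum equation $\ddot q = -(g/l)\sin q$ is $T = 2\pi\sqrt{l/g}\,/\,\mathrm{AGM}(1, \cos(q_0/2))$, where AGM is the arithmetic-geometric mean. The names `agm` and `libration_period`, and the default values of `g` and `l`, are assumptions for illustration.

```python
from math import acos, cos, pi, sqrt

def agm(x, y, iterations=40):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(iterations):
        x, y = (x + y) / 2, sqrt(x * y)
    return x

def libration_period(E, g=9.81, l=1.0):
    """Period of the oscillating solution with energy E, for -g*l < E < g*l."""
    if not -g * l < E < g * l:
        raise ValueError("librations require -g*l < E < g*l")
    q0 = acos(-E / (g * l))          # turning point: E = -g*l*cos(q0)
    return 2 * pi * sqrt(l / g) / agm(1.0, cos(q0 / 2))

g, l = 9.81, 1.0
for ratio in (-0.99, -0.5, 0.0, 0.5, 0.99):
    T = libration_period(ratio * g * l, g, l)
    print(ratio, T, 2 * pi / T)      # E/(g*l), period, circular frequency nu(E)
```

The period tends to the small-oscillation value $2\pi\sqrt{l/g}$ as $E \to -gl$ and grows without bound as $E \to gl$, which is the failure of isochronism described above.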

For the complete picture, we need to consider not only the oscillating
solutions, but also those that revolve completely around the pivot point. The
complete phase space, made up of the curves

    $H = E$,  or  $p = \pm\sqrt{2}\,l\,\sqrt{E + gl\cos q}$,


626 Chapter 21 


is shown below, first as $[-\pi, \pi] \times \mathbb{R}$ and then, more properly, as $S^1 \times \mathbb{R}$. The
curves like 1, where $|E| < gl$, already shown in the figure on the previous
page, represent oscillations (or “librations”, à la the astronomers). Curves like 3
(also circles), where $|E| > gl$, represent trajectories where the pendulum swings
completely around (“rotations”); the rotation in the opposite direction is now
split into the separate curve 3. In terms of the universal covering space map
$\rho\colon \mathbb{R} \to S^1$, a closed trajectory $\gamma\colon [a, b] \to T^*M$ is a libration when there is a
covering map $\tilde\gamma\colon [a, b] \to \mathbb{R}$, with $\rho \circ \tilde\gamma = \gamma$, for which $\tilde\gamma$ is also a closed curve,
while it is a rotation when any covering map $\tilde\gamma$ always has $\tilde\gamma(b)$ in a different
sheet of the covering space from $\tilde\gamma(a)$.

[Figure: the phase portrait of the pendulum, drawn on $[-\pi, \pi] \times \mathbb{R}$ and on $S^1 \times \mathbb{R}$ (front and back views).]

In addition to these two families of circles, and the single point $(0, 0)$, the
stable equilibrium point where the pendulum hangs straight down, there is the
dashed “separatrix”, where $E = gl$, between the two families of curves. This
looks like a curve that crosses itself, but it is actually three different curves. The
first is the single point °, represented on the left by $(\pi, 0)$ and/or $(-\pi, 0)$, the
unstable equilibrium point where the pendulum stays straight up. The other
two pieces, 2 and 2 are the non-compact solution paths that approach this point
asymptotically, swinging either clockwise or counterclockwise. With a physical
pendulum one might try to obtain (a large portion of) these paths by starting
with velocity zero extremely close to the upright point, or try to obtain half
the path by starting at the bottom with kinetic energy extremely close to $2gl$,
though friction and the required closeness make this pretty futile.


• For a particle acted upon by the force $(-\omega_1^2 q_1, -\omega_2^2 q_2)$, giving independent
harmonic oscillations in two directions, $T^*M$ is the product of the phase
spaces for two harmonic oscillators, and we can choose $f_i = \frac12(\omega_i^2 q_i^2 + p_i^2)$.


Liouville Integrability 627 


Each 2-dimensional torus corresponds to a pair of circles, one in the phase space
of the first factor, one in the phase space of the second factor, and in terms of
the corresponding two coordinates $\varphi^1$, $\varphi^2$, the equation for a trajectory $\gamma$ will
be, in typical condensed notation, the conditionally periodic motion

    $\varphi^1(t) = \varphi^1(0) + \omega_1 \cdot t$
    $\varphi^2(t) = \varphi^2(0) + \omega_2 \cdot t$.

Note that although each torus is compact, only the trajectories on this torus that
are actually periodic will have compact images, while all others will be dense in
that torus.


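For a more interesting 2-dimensional example, consider the spherical
pendulum of Problem 3-5 and Chapter 8, where we use the coordinates $\phi, \theta$ on
$M = S^2 - \{\text{lowest point}\}$ (taking note of the remark on page 479). Choosing
the mass $m = 1$ for simplicity, so that the Lagrangian is

    $L = \frac12 l^2(\dot\theta^2 + \sin^2\theta\,\dot\phi^2) + gl\cos\theta$,

we find that the Hamiltonian is

    $H = \frac{p_\theta^2}{2l^2} + \frac{p_\phi^2}{2l^2\sin^2\theta} - gl\cos\theta$.

Along any trajectory we have

(a)  $p_\theta = \frac{\partial L}{\partial\dot\theta} = l^2\dot\theta$,  $p_\phi = \frac{\partial L}{\partial\dot\phi} = l^2\sin^2\theta\,\dot\phi$,

and $p_\phi$ is a constant.
We will take $f_1 = H$ and $f_2 = p_\phi$. Since the partial derivatives of $f_1$ and $f_2$
with respect to $\theta$, $\phi$, $p_\theta$, and $p_\phi$ are

            $\partial/\partial\theta$                                                  $\partial/\partial\phi$    $\partial/\partial p_\theta$    $\partial/\partial p_\phi$
    $f_1$:  $-\frac{p_\phi^2\cos\theta}{l^2\sin^3\theta} + gl\sin\theta$      0        $\frac{p_\theta}{l^2}$      $\frac{p_\phi}{l^2\sin^2\theta}$
    $f_2$:  0                                                          0        0          1

the functions $f_1$ and $f_2$ can be linearly dependent only at $(\theta, \phi, 0, p_\phi)$ with

    $\frac{p_\phi^2\cos\theta}{l^2\sin^3\theta} = gl\sin\theta$.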

628 Chapter 21 


Substituting for $p_\phi$ from equation (a) we find that

    $\dot\phi^2 = \frac{g}{l\cos\theta}$,

and comparing with Problem 1-20, we see that the 1-parameter family of
circular motions, at a constant angle $\theta$ from the vertical, corresponds to the set
of $(f_1, f_2)^{-1}(E_\alpha, \alpha)$ for such linearly dependent $f_1, f_2$, where $E_\alpha$ is the energy
corresponding to angular velocity $\alpha$.

The inverse images of all other $(f_1, f_2)$ are 2-dimensional, with the compact
ones being 2-dimensional tori, where the coordinates $\varphi^1$ and $\varphi^2$ of a trajectory
are periodic, with the motion itself exhibiting conditionally periodic motion.

The family of planar oscillations are excluded from this analysis because our
coordinates exclude the lowest point for the pendulum bob; even if they didn’t,
since $p_\phi = 0$ in these cases, $H$ and $p_\phi$ would not be linearly independent.


• Finally, consider the 3-dimensional example of the heavy top, for which $T$
and $V$ are given on page 445. Now $M$ is SO(3), with the Euler angles $(\phi, \theta, \psi)$
as coordinates. In addition to $f_1 = H$, we have the constants of motion on
page 346, namely $f_2 = \langle L, e_z \rangle$, the component of the angular momentum $L$
along the vertical $z$ axis, and $f_3 = \langle L, e_3 \rangle$, the component along $e_3$.

Checking that $f_2$ and $f_3$ are in involution is left as an exercise for the reader.
It might also be fun to determine just when $f_1, f_2, f_3$ are not linearly
independent, and correlate these cases, as well as the cases where $C_a$ is not compact,
with the various special types of motions of the top described in Chapter 9.


We now want to reconsider, and essentially reprove, Liouville’s theorem in the 
context of this geometric picture. Note that the invariant tori theorem involves 
a single torus Cg, and thus, so far we have only been studying individual tori, 
rather than a neighborhood of one of them. In our figures for the 1-dimensional 
cases on pages 623 and 625-626, each compact $C_a$ has a neighborhood made
up of compact $C_b$ for $b$ near $a$, and looks like the product of an open interval
and $C_a$. Our first order of business is to prove this generally, in all dimensions.


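PRODUCT NEIGHBORHOOD LEMMA. Under the hypotheses for the
invariant tori theorem, there is an open disc $D \subset \mathbb{R}^n$, and a neighborhood $U$
of $C_a$ that is diffeomorphic to the product $D \times C_a$, under a diffeomorphism
$\Phi\colon D \times C_a \to U$ for which $f \circ \Phi$ is the projection $\mathrm{pr}_1$ on the first factor, every
$\mathrm{pr}_1^{-1}(x)$ being equal to some $C_b$.

[Diagram: $\Phi\colon D \times C_a \to U$, together with $\mathrm{pr}_1\colon D \times C_a \to D$ and $f\colon U \to D$, so that $f \circ \Phi = \mathrm{pr}_1$.]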


Liouville Integrability 629 


PROOF (Simple but messy). We begin with some preliminary observations that 
are needed to take care of a few details. 


1. We claim that there is an open set $V \supset C_a$ such that any $C_b \subset V$ will
be compact. In fact, using compactness of $C_a$ and the hypothesis that $f_*$ has
maximal rank $n$ on $C_a$, we see that there is an open set $W \supset C_a$, with compact
closure, on which $f_*$ always has maximal rank. Choose an open $V$ with
$C_a \subset V \subset \bar V \subset W$.


2. Now we claim that if $C_b \subset V$, then $C_b$ must be closed, and thus compact.
For otherwise, there would be a point of $\bar V$ which is in the closure of $C_b$, though
not in $C_b$ itself, contradicting the fact that $f_*$ has maximal rank at this point.


3. Consequently, if $C_b$ intersects $V$, but $C_b$ is not compact, then $C_b$ must contain
points outside of $V$, and thus by connectedness it must contain a point of the
topological boundary $bV$ of $V$.


4. If there were non-compact $C_b$ arbitrarily close to $C_a$, then there would be a
sequence $C_{b_1}, C_{b_2}, \dots$ containing points $\mathfrak{p}_1, \mathfrak{p}_2, \dots$ approaching a point of $C_a$,
and corresponding points $\mathfrak{q}_1, \mathfrak{q}_2, \dots$ in $bV$. These would have a limit point
$\mathfrak{q} \in bV$ for which we would have $f(\mathfrak{q}) = a$, again a contradiction.


Conclusion: Some neighborhood $V'$ of the compact $C_a$ is the union of compact
sets of the form $C_b$.


Our construction of the angular coordinates $\varphi^1, \dots, \varphi^n$ on $C_a$ depended on
a choice of a point $\mathfrak{p} \in C_a$ (the point for which $\varphi^i(\mathfrak{p}) = 0$ for all $i$). To choose
points $\mathfrak{p}_b$ in nearby $C_b$, we consider an open set $\mathcal{D} \subset V$ containing $\mathfrak{p}$ that
projects diffeomorphically under $f$ to an open disc $D$ around $a$, and simply
choose $\mathfrak{p}_b$ to be the point of $C_b$ that is on $\mathcal{D}$. Choosing $D$, and thus $\mathcal{D}$,
sufficiently small, each such $C_b$ will be contained completely in $V$, and the union
of these $C_b$ will be a neighborhood $V'$ of $C_a$. Using the $\mathfrak{p}_b$ as initial points for
the $\varphi^i$ on the $C_b$, we can thus extend the multiple-valued functions $\varphi^i$ for $C_a$
to functions on $V'$; for convenience they will also be denoted by $\varphi^i$.


Finally, we consider the $2n$ functions $\varphi^1, \dots, \varphi^n, f_1, \dots, f_n$ and note that our
hypotheses and the compactness of $C_a$ imply that they will be a coordinate
system in a neighborhood of $C_a$. So we just shrink $D$ so that $f^{-1}(D)$ is in this
neighborhood, and let the union of all the corresponding $C_b$ be $U$. ∎




Action-angle variables. In the very simple 1-dimensional case of the harmonic oscillator, for $\omega = 1$, with the single function $f = f_1 = E$ on $T^*M = \mathbb{R}\times\mathbb{R}$, our $\varphi$

[Figure: $H = \tfrac12(q^2 + p^2)$]

would seem to correspond to the $Q^1$ given by Liouville’s theorem, so it might seem like a good guess that $(\varphi, E)\colon \mathbb{R}\times\mathbb{R} \to \mathbb{R}\times\mathbb{R}$ is a symplectic map. Symplectic is essentially the same as area preserving in the 1-dimensional case, and we note that $(\varphi, E)$ takes a circular sector having area $\tfrac12\varphi a^2$ into

[Figure: the map $(\varphi, E)$]

a rectangle of area $\tfrac12\varphi a^2$, as expected.¹ Moreover, $(\varphi, E)$ preserves area [up to sign] in general, since the area of an arbitrary region $A$ is given in polar coordinates by

\[ \text{area } A = \int_A r\,dr\,d\varphi = \int_A d\bigl(\tfrac12 r^2\bigr)\wedge d\varphi. \tag{P} \]


This special case of the change of variable formula is often given an elemen- 
tary proof by approximating a region by subregions for which the formula is 
obvious from the case of circular sectors. And what this whole argument really 





shows is that our map is symplectic not because the value of the second coor- 
dinate along the circle of radius a is the energy E along this circle, but because 
that value happens to be 


\[ \frac{1}{2\pi}\,(\text{area of the region enclosed by this circle}). \]
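The area-preservation claim for the harmonic oscillator with $\omega = 1$ can be checked numerically. The following is only a sketch under stated assumptions: the function names are invented for illustration, and the angle is taken to be measured clockwise from the positive $p$-axis, so that $q = a\sin\varphi$, $p = a\cos\varphi$ on the circle of radius $a$. It estimates the Jacobian determinant of $(q, p) \mapsto (\varphi, E)$ by central differences.

```python
import math


def angle_energy(q, p):
    """Map a phase point (q, p) to (phi, E) for H = (q^2 + p^2) / 2.

    Assumption: phi is measured clockwise from the positive p-axis,
    so that q = a*sin(phi), p = a*cos(phi) on the circle of radius a.
    """
    phi = math.atan2(q, p)
    energy = 0.5 * (q * q + p * p)
    return phi, energy


def jacobian_determinant(q, p, h=1e-6):
    """Central-difference estimate of det d(phi, E) / d(q, p)."""
    phi_q_hi, e_q_hi = angle_energy(q + h, p)
    phi_q_lo, e_q_lo = angle_energy(q - h, p)
    phi_p_hi, e_p_hi = angle_energy(q, p + h)
    phi_p_lo, e_p_lo = angle_energy(q, p - h)
    dphi_dq = (phi_q_hi - phi_q_lo) / (2.0 * h)
    dphi_dp = (phi_p_hi - phi_p_lo) / (2.0 * h)
    de_dq = (e_q_hi - e_q_lo) / (2.0 * h)
    de_dp = (e_p_hi - e_p_lo) / (2.0 * h)
    return dphi_dq * de_dp - dphi_dp * de_dq


if __name__ == "__main__":
    # Sample points chosen away from the branch cut of atan2 (q = 0, p < 0).
    for q, p in [(0.3, 1.2), (-1.0, 0.5), (2.0, -0.7)]:
        print(q, p, jacobian_determinant(q, p))
```

With this orientation the determinant comes out as $+1$; measuring the angle in the opposite sense would give $-1$, which is the sign ambiguity the text mentions.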


¹ Since $\varphi$ is measured clockwise, one might argue that the area of the circular sector should be counted as $-\tfrac12\varphi a^2$, but we needn’t worry about this detail for this intuitive discussion. Being symplectic up to sign is as good as being symplectic, so far as preserving the structure of Hamilton’s equations is concerned, and in any case, the direction of $\varphi$ is actually arbitrarily determined by our choice of 1 or −1 as the generator of the subgroup $G = \mathbb{Z} \subset \mathbb{R}$ in our proof of the invariant tori theorem.




This might suggest that we choose this value for more general cases, like the 
pendulum. Aside from the fact that we don’t really want to mess around with 
such geometric arguments, this prescription runs into trouble when we consider 
the $C_a$ for rotations, which don’t enclose an area. Fortunately, there is an easy way out, implicit in the calculation of (P) by iterated integrals. When $C_a$ does enclose a region $A$, so that $C_a = \partial A$, we can use Stokes’ theorem to write

\[ \int_A dp\wedge dq = \int_A d(p\,dq) = \int_{\partial A} p\,dq = \int_{C_a} p\,dq. \]

So we can define a second coordinate function $J$ on any $C_a$ of $U$ by the formula

\[ J = \frac{1}{2\pi}\oint_{C_a} p\,dq; \]

the traditional circle on the integral sign reminds us that the integral over $C_a$ is computed by integrating over a closed curve $\gamma$ going once around $C_a$, but it will be eliminated once we start writing integrals over curves.
An even greater advantage of this definition is that it can be used for the case where $C_a$ is an $n$-dimensional torus $T^n$. We consider closed curves $\gamma_1,\dots,\gamma_n$ representing generators for the 1-dimensional homology of $T^n$, and define the numbers $J_i$ for $T^n$ to be

\[ J_i = \frac{1}{2\pi}\int_{\gamma_i}\sum_{k=1}^n p_k\,dq^k = \frac{1}{2\pi}\int_{\gamma_i}\theta, \]

recalling $\theta$ from pages 511-512. This definition does not depend on the particular curve $\gamma_i$ that we pick: if $\gamma_i$ and $\gamma_i'$ represent the same homology class, so that $\gamma_i - \gamma_i' = \partial\sigma$ for some 2-chain $\sigma$, then since $\omega$ is zero on the tangent space of each $C_a$ by the isotropy lemma on page 615, we have

\[ \int_{\gamma_i}\theta - \int_{\gamma_i'}\theta = \int_{\partial\sigma}\theta = \int_\sigma d\theta = \int_\sigma\omega = 0. \]

Once we have picked $\gamma_1,\dots,\gamma_n$ for $C_a$, the diffeomorphism given by the product neighborhood lemma picks out corresponding curves on each of the $C_b \subset U$. We thus obtain functions $J_1,\dots,J_n\colon U \to \mathbb{R}$ that are constant on each $C_b$. The $J_i$ are called the action variables, since they have the dimensions of action (page 464).




While Liouville’s theorem gives us symplectic coordinates $(Q, f)$ for which the $Q$’s are integrals, in the case of the more geometric action-angle variables

\[ (\varphi, J) = (\varphi^1,\dots,\varphi^n, J_1,\dots,J_n) \]

the $J$’s are actually constants on the invariant tori, but the first set of functions, the $\varphi$’s, are not integrals. Nevertheless, as in the special case of Liouville’s theorem for time-independent Hamiltonians, if $(\varphi, J)$ is a symplectic coordinate system, then the solutions of Hamilton’s equations will still be especially simple: All the $\varphi^i$ parameter curves, formed by varying one $\varphi^i$ and keeping the others fixed, are solutions of Hamilton’s equations, which implies that $H$ is constant along these curves, so that

\[ \frac{\partial H}{\partial\varphi^i} = 0. \]

This means that $H$ does not depend on the $\varphi^i$, so that we have

\[ \frac{\partial H}{\partial J_i}(z) = \nu_i\bigl(J_1(z),\dots,J_n(z)\bigr) \]

for certain functions $\nu_i$ (compare pages 587-588 and 619) and thus Hamilton’s equations

\[ \dot\varphi^i = \frac{\partial H}{\partial J_i}, \qquad \dot J_i = -\frac{\partial H}{\partial\varphi^i} \]

will reduce to

\[ \dot\varphi^i = \nu_i, \qquad \dot J_i = 0, \]

with solutions (in condensed notation)

\[ J_i = a_i, \qquad \varphi^i(t) = b_i + \nu_i(a_1,\dots,a_n)\cdot t \qquad\text{for constants } a_i, b_i. \]


Of course, we really can’t expect $(\varphi, J)$ to be symplectic without some adjustments, since the definition of the $\varphi$’s depended on various arbitrary choices, not to mention that the $f$’s themselves could be replaced by linear combinations of them. So for our geometric version of Liouville’s theorem we just want to prove the existence of certain functions $(\varphi^1,\dots,\varphi^n)$ for which $(\varphi, J)$ is symplectic and for which the $\varphi$’s serve as multiple-valued coordinates mod $2\pi$ on each of the invariant tori.


Notice, however, that we are no longer working on an arbitrary symplectic manifold, but specifically on the cotangent bundle $T^*M$, since we use the 1-form $\theta = \sum_k p_k\,dq^k$. Fortunately, this restriction won’t be bothersome for the mechanics examples that we will first be looking at, so, saving the general treatment for later, we outline a proof for the case of $T^*M$; for this proof we will also have to assume that $(q, J)$ is a coordinate system [this corresponds to (*) in Liouville’s theorem; it wasn’t required as an assumption when we were working locally, but now we are working only “semi-locally”, in a neighborhood of one of the invariant tori]. The $n$-tuples $J = (J_1,\dots,J_n)$ will now serve as coordinates for the various tori $C_b$, and we will let $C_{\mathbf{J}}$ denote the invariant torus determined by a particular value $\mathbf{J}$ of $J$.

The argument will be similar to that used for Liouville’s theorem, with the $J$’s now playing the part of the $f$’s. As in part I of Liouville’s theorem, we want to find a type 2 generating function $S$ that gives us the $\varphi$’s, which means that we want the $J$’s to satisfy

\[ \frac{\partial S}{\partial q^i}\bigl(q, J(q, p)\bigr) = p_i, \tag{a} \]

so that we can then define the $\varphi^i$ by

\[ \frac{\partial S}{\partial J_i}\bigl(q, J(q, p)\bigr) = \varphi^i. \tag{b} \]


Since we are assuming that $(q, J)$ is a coordinate system, the implicit function theorem gives us a function $\bar p$ such that, for all arguments $\mathbf{q}$ and $\mathbf{J}$ of $q$ and $J$, we have

\[ J\bigl(\mathbf{q}, \bar p(\mathbf{q}, \mathbf{J})\bigr) = \mathbf{J}. \]

Also, as in the proof of the product neighborhood lemma, we choose initial points $p_{\mathbf{J}}$ in $C_{\mathbf{J}}$ lying along an $n$-dimensional submanifold transversal to all the tori $C_{\mathbf{J}}$.
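We then define

\[ S(\mathbf{q}, \mathbf{J}) = \int_\gamma\sum_{k=1}^n p_k\,dq^k = \int_\gamma\theta, \]

where $\gamma$ is a curve lying entirely in $C_{\mathbf{J}}$, going from $p_{\mathbf{J}}$ to $(\mathbf{q}, \bar p(\mathbf{q}, \mathbf{J}))$.

Obviously $S(\mathbf{q}, \mathbf{J})$ is not uniquely defined, since $\gamma$ can wrap around the torus any number of times, but using the fact that $C_{\mathbf{J}}$ is isotropic, we see that $S$ is a well-defined multiple-valued function mod $2\pi$ [locally, this corresponds to the use of equation (3) on page 616 to define $S$ on page 617].

Equation (a) is straightforward from the definition of $S$, and then (b) gives

\[ \oint_{\gamma_i} d\varphi^i = \oint_{\gamma_i} d\Bigl(\frac{\partial S}{\partial J_i}\Bigr) = \frac{\partial}{\partial J_i}(2\pi J_i) = 2\pi, \]

so that the $\varphi^i$ are indeed multiple-valued functions mod $2\pi$.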




The final step [an analogue of part II of Liouville’s theorem, but with the $J$’s instead of the $f$’s], is to show that the $\varphi$’s are actually coordinates on the tori, i.e., that each $\varphi^i$ parameter curve lies on some torus. For this we basically just reverse the reasoning on page 632: what we need to prove is essentially the same as showing that $H$ is constant along these curves,

\[ \frac{\partial H}{\partial\varphi^i} = 0. \]

But this follows automatically: If $\gamma$ is any solution curve, then $J_i(\gamma(t))$ is constant, since $\gamma$ lies on an invariant torus, so

\[ 0 = \frac{dJ_i(\gamma(t))}{dt} = -\frac{\partial H}{\partial\varphi^i}\bigl(\gamma(t)\bigr), \]

and there is some solution curve $\gamma(t)$ through any point of $T^*M$. Q.E.D.


At the end of all this, note that $S$ is defined as an integral, and hence the $\varphi^i$
and our equations are solvable by quadratures.


As a final remark before looking at two simple examples from mechanics, note that since $(\varphi, J)$ is a symplectic coordinate system for $T^*M$, the $J_i$ will be the generalized conjugate momentum to the $\varphi^i$ for the Lagrangian $L$ corresponding to $H$,

\[ J_i = \frac{\partial L}{\partial\dot\varphi^i}. \]

Now $L$ has the dimensions of energy $E = MV^2$, as on page 498, while the $\varphi^i$ are dimensionless, so that the $\dot\varphi^i$ will have the dimensions $T^{-1}$. Consequently, the $J_i$ must have the dimensions $ET$ of action.


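• For our first example, we consider once again the sorely put-upon harmonic oscillator, with

\[ H(q, p) = \tfrac12\bigl(p^2 + \omega^2 q^2\bigr). \]

In this simple case, the curve $H = E$ for a constant $E$ is just the ellipse

\[ \frac{\omega^2 q^2}{2E} + \frac{p^2}{2E} = 1 \]

[Figure: the ellipse, with semi-axes $\sqrt{2E}/\omega$ and $\sqrt{2E}$.]

with area equal to $2\pi E/\omega$, so we immediately know that

\[ J = \frac{1}{2\pi}\cdot\frac{2\pi E}{\omega} = \frac{E}{\omega} \quad\text{or}\quad H = \omega J \implies \frac{\partial H}{\partial J} = \omega, \]

and the solutions all have circular frequency $\omega$, or ordinary frequency $\omega/2\pi$.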
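The value $J = E/\omega$ can also be confirmed by evaluating $\frac{1}{2\pi}\oint p\,dq$ numerically around the ellipse. This is a minimal sketch, not part of the original argument: the function name and the particular parametrization of the ellipse by an auxiliary angle are choices made here for illustration.

```python
import math


def oscillator_action(energy, omega, n=1000):
    """Approximate J = (1 / 2 pi) * closed integral of p dq on H = E.

    The ellipse p^2 + omega^2 q^2 = 2E is parametrized (an illustrative
    choice) by q = (sqrt(2E)/omega) sin(s), p = sqrt(2E) cos(s), and the
    integral over s in [0, 2 pi] is evaluated with the midpoint rule.
    """
    q_amplitude = math.sqrt(2.0 * energy) / omega
    p_amplitude = math.sqrt(2.0 * energy)
    ds = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        p = p_amplitude * math.cos(s)
        dq = q_amplitude * math.cos(s) * ds  # dq = (dq/ds) ds
        total += p * dq
    return total / (2.0 * math.pi)


if __name__ == "__main__":
    energy, omega = 3.0, 2.0
    print(oscillator_action(energy, omega), energy / omega)
```

The two printed numbers should agree to rounding error, since the midpoint rule is essentially exact for trigonometric polynomials over a full period.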




This is obviously too cute by half, but it does illustrate the important general
point that knowing the action-angle variables allows us to find the frequencies 
without having to solve the equations of motion themselves. 

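If we don’t want to take advantage of the special circumstance of this problem, then we must go right back to our calculations in Chapter 18, pages 550-551, where we wrote the solution $S$ of the Hamilton-Jacobi equation as $S(q, \alpha, t) = W(q, \alpha) - \alpha t$ for

\[ W(q, \alpha) = \omega\int\sqrt{\frac{2\alpha}{\omega^2} - q^2}\;dq. \]

Note, from the first two equations on page 585 of Chapter 19, that this $W$ is precisely the $S$ in the proof that we sketched on page 633. We can thus write

\begin{align*}
J = \frac{1}{2\pi}\oint_\gamma p\,dq &= \frac{1}{2\pi}\oint_\gamma\frac{\partial W}{\partial q}\,dq && \text{by (a) on page 633 (only looking like computational double-talk)}\\
&= \frac{\omega}{2\pi}\oint_\gamma\sqrt{\frac{2\alpha}{\omega^2} - q^2}\;dq\\
&= \frac{\alpha}{\pi\omega}\int\cos^2\theta\,d\theta && \text{using the substitution } q = \sqrt{2\alpha/\omega^2}\,\sin\theta.
\end{align*}

Inserting the limits of integration that correspond to a closed curve going once around the invariant torus, we can compute that

\[ J = \frac{\alpha}{\pi\omega}\int_0^{2\pi}\cos^2\theta\,d\theta = \frac{\alpha}{\omega}, \]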


and we again obtain $H = \omega J$, and thereby determine the frequency of all orbits. When we actually calculate $W$, say by writing out the formula for $\int\cos^2\theta\,d\theta$, equations (a) and (b) on page 633 can be used to obtain $\varphi$ and $p$. The results are

\[ q = \frac{\sqrt{2E}}{\omega}\sin\varphi, \qquad p = \sqrt{2E}\cos\varphi, \]

where the first formula agrees with formula (b′) on page 551, while the two formulas together illustrate the relationship indicated at the top of page 625.


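• For the inverse square force in polar coordinates $(r, \theta)$ we have to find the functions $J_r$, $J_\theta$ on the 2-dimensional tori. For any potential function $V$ we have the general formula (page 552)

\[ W(r, \theta, \alpha, \alpha_\theta) = \int\sqrt{2m\bigl(\alpha - V(r)\bigr) - \frac{\alpha_\theta^2}{r^2}}\;dr + \alpha_\theta\theta, \]

and we get

\begin{align*}
J_\theta &= \frac{1}{2\pi}\oint_\gamma p_\theta\,d\theta = \frac{1}{2\pi}\oint_\gamma\alpha_\theta\,d\theta = \alpha_\theta,\\
J_r &= \frac{1}{2\pi}\oint_\gamma p_r\,dr = \frac{1}{2\pi}\oint_\gamma\sqrt{2m\bigl(\alpha - V(r)\bigr) - \frac{\alpha_\theta^2}{r^2}}\;dr,
\end{align*}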


illustrating the general fact that when we have separated the variables in the 
Hamilton-Jacobi equation (usually the only way that we can solve it), the integrals for the $J_i$ simplify, containing just one $p_i\,dq^i$ apiece (compare page 640).
Problem 2 gives some further consequences. 


For an inverse square force $V(r) = -k/r$, we have to compute the integral

\[ J_r = \frac{1}{2\pi}\oint_\gamma\sqrt{2mE + \frac{2mk}{r} - \frac{\alpha_\theta^2}{r^2}}\;dr. \]

A computation using rather interesting elementary trickery specifically related to the problem is given in Problem 3, but, following the lead of Sommerfeld [1], the computation is usually effected by contour integration (Problem 4), yielding

\begin{align*}
J_r &= -\alpha_\theta + k\sqrt{-m/2E}\\
&= -J_\theta + k\sqrt{-m/2E}.
\end{align*}

We can write this as

\[ H = \frac{-k^2 m}{2(J_r + J_\theta)^2}, \]

which brings up a second important point. Obviously

\[ \frac{\partial H}{\partial J_r} = \frac{\partial H}{\partial J_\theta}, \qquad\text{and thus}\qquad \nu_r = \nu_\theta. \]

This means that the periods for the variation of $r$ and $\theta$ are the same, and it follows that all orbits must be closed. This situation is called “degeneracy”, and would also be the case for any rational relation between the periods [in dimensions $n > 2$ we can have different degrees $d \le n$ of degeneracy, with $d = n$ guaranteeing that all orbits are closed].
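The closed form for $J_r$ can be checked without contour integration by evaluating the radial integral numerically between the two turning points. The sketch below is illustrative only: the function names and the substitution $r = c - d\cos u$ (which removes the square-root singularities at the turning points) are choices made here, not taken from the text or from Problems 3 and 4.

```python
import math


def radial_action(m, k, energy, alpha_theta, n=2000):
    """Approximate J_r = (1 / 2 pi) * closed integral of p_r dr for V = -k/r.

    Requires energy < 0 (bound orbit).  The turning points r1 < r2 are the
    roots of 2 m E r^2 + 2 m k r - alpha_theta^2 = 0.  With r = c - d cos(u),
    c = (r1 + r2)/2, d = (r2 - r1)/2, one has
    p_r = sqrt(-2 m E) * d * sin(u) / r  and  dr = d * sin(u) du.
    """
    if energy >= 0.0:
        raise ValueError("bound orbits require negative energy")
    a2 = 2.0 * m * energy
    disc = math.sqrt((2.0 * m * k) ** 2 + 4.0 * a2 * alpha_theta ** 2)
    r1 = (-2.0 * m * k + disc) / (2.0 * a2)
    r2 = (-2.0 * m * k - disc) / (2.0 * a2)
    c = 0.5 * (r1 + r2)
    d = 0.5 * (r2 - r1)
    du = math.pi / n
    half_loop = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        r = c - d * math.cos(u)
        p_r = math.sqrt(-a2) * d * math.sin(u) / r
        half_loop += p_r * d * math.sin(u) * du
    # Outward and return legs of the closed curve contribute equally.
    return 2.0 * half_loop / (2.0 * math.pi)


def energy_from_actions(m, k, j_r, j_theta):
    """H = -k^2 m / (2 (J_r + J_theta)^2), as in the text."""
    return -k * k * m / (2.0 * (j_r + j_theta) ** 2)


if __name__ == "__main__":
    m, k, energy, alpha_theta = 1.0, 1.0, -0.5, 0.8
    j_r = radial_action(m, k, energy, alpha_theta)
    print(j_r, -alpha_theta + k * math.sqrt(-m / (2.0 * energy)))
    print(energy_from_actions(m, k, j_r, alpha_theta), energy)
```

Because $H$ depends on the actions only through the sum $J_r + J_\theta$, the two partial derivatives coincide, which is the degeneracy discussed above.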

For a general central force in 2 dimensions, we usually don’t have degener- 
acy, and most orbits are dense in the tori, instead of being closed curves. In 




such non-degenerate cases, the tori are well-determined, no matter what initial 
functions $f_1,\dots,f_n$ we might start with, since they are simply the closures of
orbits. But in degenerate cases they usually will not be well-determined—they 
are tori of dimension less than n, which can be contained in different ways in 
n-dimensional tori. For example, for an inverse square force, we can separate 
variables not only in polar coordinates, but also in the elliptic coordinates of 
Addendum 18A, and these lead to different sets of tori. The situation is similar 
for the planar harmonic oscillator on page 626, where we can separate variables 
in polar coordinates as well as in cartesian coordinates. (Nice exercises for the 
eager reader.) 


Action-angle variables on symplectic manifolds. Finally, we end with a proof of 
the action-angle variables theorem on symplectic manifolds in general, without 
any additional assumptions. While the proof is more abstract, it is very direct, 
and only the disembodied ghost of the generating function remains. 


ACTION-ANGLE VARIABLES THEOREM. Let $(M, \omega)$ be a symplectic manifold of dimension $2n$ with functions $f_1,\dots,f_n, H\colon M \to \mathbb{R}$ satisfying the hypothesis of the invariant tori theorem, and let $D$ be an open disc $D \subset \mathbb{R}^n$ and $U$ a neighborhood of an invariant torus $C_a$ given by the product neighborhood lemma. Then there is a symplectic coordinate system

\[ (\varphi^1,\dots,\varphi^n, J_1,\dots,J_n) \]

on $U$ for which the $\varphi$’s are multiple-valued coordinates mod $2\pi$ on the various tori $C_b \subset U$ and the $J$’s are constant on these tori (automatically implying that $\partial H/\partial\varphi^i = 0$, as on page 634).


PROOF. It will be convenient simply to identify $U$ with $D \times C_a$ via the product neighborhood lemma, which tells us that for the projection

\[ \pi_D\colon U = D \times C_a \to D \]

the inverse image $\pi_D^{-1}(x)$ is an invariant torus for each $x \in D$.

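We let $X_i$ be the vector fields used in the proof of the invariant tori theorem, while $\tilde\varphi^i$ will now be used to denote the $\varphi^i$ we obtained in that theorem, so that $(\tilde\varphi^1,\dots,\tilde\varphi^n, f_1,\dots,f_n)$ is the coordinate system for $U$ determined in the proof of the product neighborhood lemma. Then (cf. the remark after the proof of the invariant tori theorem) we can write

\[ \frac{\partial}{\partial\tilde\varphi^i} = \sum_{k=1}^n a_{ik}X_k \]

for certain functions $a_{ik}$ that are constant on each invariant torus.
In the expression for the 2-form $\omega$ in the $(\tilde\varphi, f)$ coordinate system, there are no terms involving the $d\tilde\varphi^i\wedge d\tilde\varphi^j$, since $\omega = 0$ on the invariant tori. For the coefficient of the $d\tilde\varphi^i\wedge df_j$ term we have, since $X_k \mathbin{\lrcorner} \omega = df_k$,

\[ \omega\Bigl(\frac{\partial}{\partial\tilde\varphi^i}, \frac{\partial}{\partial f_j}\Bigr) = \sum_k a_{ik}\,\omega\Bigl(X_k, \frac{\partial}{\partial f_j}\Bigr) = \sum_k a_{ik}\,df_k\Bigl(\frac{\partial}{\partial f_j}\Bigr) = a_{ij}, \]

and thus

\[ \omega = \sum_{i,j=1}^n a_{ij}\,d\tilde\varphi^i\wedge df_j + \sum_{i,j=1}^n b_{ij}\,df_i\wedge df_j \]

for certain functions $b_{ij}$. Since $d\omega = 0$, we have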


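\[ \frac{\partial b_{ij}}{\partial\tilde\varphi^k} = \frac{\partial a_{kj}}{\partial f_i} - \frac{\partial a_{ki}}{\partial f_j}, \]

and the right side doesn’t depend on the $\tilde\varphi^i$, so the derivatives of $b_{ij}$ are constant along the $\tilde\varphi^i$ parameter curves, and thus must be 0, since the parameter curves are closed. Thus the $b_{ij}$ as well as the $a_{ij}$ are constant on each invariant torus.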
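Now write $\omega$ in the form

\begin{align*}
\omega &= \sum_{i=1}^n d\tilde\varphi^i\wedge\Bigl(\sum_{j=1}^n a_{ij}\,df_j\Bigr) + \sum_{i,j=1}^n b_{ij}\,df_i\wedge df_j\\
&= \sum_{i=1}^n d\tilde\varphi^i\wedge A_i + B \qquad\text{for } A_i = \sum_{j=1}^n a_{ij}\,df_j,\quad B = \sum_{i,j=1}^n b_{ij}\,df_i\wedge df_j.
\end{align*}

Since the $a_{ij}$ and $b_{ij}$ are constant on each invariant torus, we can regard $A_i$ and $B$ as forms on $D$, i.e., there are 1-forms $\alpha_i$ and a 2-form $\beta$ on $D$ such that

\[ A_i = \pi_D^*\alpha_i, \qquad B = \pi_D^*\beta. \]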
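From

\[ 0 = d\omega = -\sum_{i=1}^n d\tilde\varphi^i\wedge\pi_D^*\,d\alpha_i + \pi_D^*\,d\beta \]

we can conclude that $d\alpha_i = 0$ and $d\beta = 0$, which implies that there are functions $j_i$ and a 1-form $\gamma$ on the disc $D$ for which¹

\[ \alpha_i = dj_i, \qquad \beta = d\gamma. \]

Finally, we set

\[ J_i = j_i\circ\pi_D = \pi_D^* j_i, \qquad\text{noting that } dJ_i = A_i. \]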


¹ We now have $\omega = d\lambda$ for $\lambda = -\sum_i (j_i\circ\pi_D)\,d\tilde\varphi^i + \pi_D^*\gamma$, suggesting a proof
analogous to the case of $T^*M$, but we follow a somewhat different route.




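Since we have

\[ \omega = \sum_{i=1}^n d\tilde\varphi^i\wedge dJ_i + B, \]

the matrix of $\omega$ in the $(\tilde\varphi, f)$ coordinate system is

\[ \begin{pmatrix} 0 & \dfrac{\partial J_i}{\partial f_j}\\[2ex] -\dfrac{\partial J_j}{\partial f_i} & b_{ij} \end{pmatrix}. \]

Since $\omega$ is nonsingular, the determinant of this matrix is nonzero, so the same must be true of the determinant of $(\partial J_i/\partial f_j)$, so $(\tilde\varphi, J)$ is also a coordinate system.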


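We now adjust for the arbitrary choice of the origins of our angular coordinates in the proof of the product neighborhood lemma by writing $\gamma$ as

\[ \gamma = \sum_{i=1}^n g_i\,dj_i, \qquad\text{and then setting}\qquad \varphi^i = \tilde\varphi^i + (g_i\circ\pi_D). \]

This gives us

\begin{align*}
\sum_{i=1}^n d\varphi^i\wedge dJ_i &= \sum_{i=1}^n d\tilde\varphi^i\wedge dJ_i + \sum_{i=1}^n d(g_i\circ\pi_D)\wedge dJ_i\\
&= \sum_{i=1}^n d\tilde\varphi^i\wedge dJ_i + \sum_{i=1}^n d(g_i\circ\pi_D)\wedge d(j_i\circ\pi_D)\\
&= \sum_{i=1}^n d\tilde\varphi^i\wedge dJ_i + \pi_D^*\,d\gamma = \sum_{i=1}^n d\tilde\varphi^i\wedge A_i + B = \omega.
\end{align*}

Thus we have found symplectic coordinates $(\varphi, J)$ with the $J_i$ constant on the invariant tori, and with the $\varphi^i$, like the $\tilde\varphi^i$, multiple-valued functions mod $2\pi$ on these tori.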

To make our result resemble the $T^*M$ case more closely, we might make use of the footnote on the previous page, or we can note that

\[ \omega = d\theta \qquad\text{for}\qquad \theta = -\sum_k J_k\,d\varphi^k, \]

and thus

\[ J_i = \frac{1}{2\pi}\int_{\gamma_i}\theta, \]

where the $\gamma_i$ are the parameter curves for the $\varphi^i$, but now traversed in the reverse direction. ❖




Background. At the end of this long development, we give a brief historical 
account, which will indicate how the various strands of our theorem came about 
and were woven together, and also lead us into the final chapter. 


• The source usually quoted for Liouville’s theorem, Liouville [1] of 1855, was
basically just an announcement of the result. The first published proof I know
of occurs in Whittaker [1; §148], originally published in 1904, with a note stating 
that the theorem is “essentially the application to Hamilton’s partial differential 
equation of the well-known method for finding a Complete Integral of a non- 
linear partial differential equation of the first order.” 

Perhaps in line with this view, in Whittaker’s proof the step establishing equa- 
tion (3) on page 616 relies on a forbidding looking thicket of previous results, 
and the rest of the argument doesn’t look especially inviting either. Our proof 
is based on Pars [1; §22.14 and §25.7] of 1965, except that Pars’ proof for equa- 
tion (3), though straightforward, is nevertheless strangely complicated; the sim- 
plification used here comes from the paper Jost [1], which will be mentioned 
again a little later on, in a more important context. 


• Action-angle type variables were first used in 1860 by Delaunay [1] to study
the motion of the moon, and for a long time the “Delaunay variables” (for 
which one may consult Abraham and Marsden [1], Boccaletti and Pucacco [1], 
and Fasano and Marmi [1]) were used almost exclusively by astronomers. 

Early in the 20th century, however, they aroused the interest of physicists when Sommerfeld noted that Niels Bohr’s quantum conditions on the orbit of an electron around the nucleus could be formulated as requiring the action variables for the orbit to be an integer multiple of $\hbar = h/2\pi$ (the “royal road to quantization” in Sommerfeld [1]). Einstein [1] in 1917 first drew attention to the integrals $\oint\sum_k p_k\,dq^k$ in the non-separable case, and by 1924 a thorough presentation of action-angle variables was available in Max Born [1].

The “action-angle” terminology itself was introduced by Schwarzschild [1] in 
1916; an English translation may be found in Duck and Sudarshan [1], which 
also describes the harrowing circumstances under which Schwarzschild worked 
(while also obtaining the famous Schwarzschild solution in general relativity). 

Finally, we note that the action-angle variables had always been used in con- 
nection with the Hamilton—Jacobi equation, rather than integrals in involution. 
Audin [1] cites Mineur [1] of 1936 for the first treatment of the latter case. 


• The invariant tori theorem first appeared (in Russian) in Arnold [1] of 1963, and then in Arnold and Avez [1; Appendix 26] of 1968, which begins by noting “It was pointed out long ago that ... the manifolds ... turn out to be tori, and motion along them is quasi-periodic... ” (unfortunately, no mention is made of who did the pointing). The theorem then made its way into Arnold [2] of 1978, the usual reference, as part of a long exposition in Chapter 10, modestly entitled simply Liouville’s theorem on integrable systems. The physicist R. Jost gave
the first treatment for arbitrary symplectic manifolds in Jost [1] of 1968, where 
the author likewise modestly suggested that it might be regarded merely as a 
“sort of commentary” on Arnold’s proof. 

In addition to the path provided by Jost’s paper, other approaches to the 
modern theorem have been given, and considerable tinkering has been applied 
to the details, so the basic ideas appear in numerous guises. The proof given 
here follows Bolsinov and Fomenko [1], which also points to other approaches. 

Although the tori provided mathematicians with a wonderful conceptual sim- 
plification, physicists, as pointed out previously, had simply discussed things in 
terms of multiple-valued functions. For example, in the 1-dimensional case, a rotation $\gamma$ was typically pictured not as a circle on $S^1 \times \mathbb{R}$ but as a repeating function on $\mathbb{R}\times\mathbb{R}$, as in (a) from Born [1]. In other words, physicists basically looked at a covering map $\tilde\gamma$ in the universal covering space. Similarly, Born’s book contains a representation (b) of the generators of the discrete subgroup $G$ on page 622, and its partition of $\mathbb{R}^n$, essentially describing the universal covering map from $\mathbb{R}^n$ onto these unmentioned tori.

[Figures (a) and (b), from Born [1].]


• In any case, the initial excitement for physicists died down quite rapidly, as
it soon became clear that despite the striking initial success of the Bohr theory 
of the atom, it was inadequate, eventually to be replaced by wave mechanics. 

Nevertheless, aside from the fact that action-angle variables are immensely
important for modern advances in “classical” mechanics, other ideas involved 
in analyzing the Bohr theory also became important. In particular, one of 
the pot-holes in Sommerfeld’s royal road to quantization was the fact that it 
only worked when one used the “right” action-angle coordinates, raising the 
question of how one could characterize these coordinates. This led to the idea
of adiabatic invariants in mechanics, which we will consider in the next chapter. 


As a result of the mongrel pedigrees for these results, the theorem on invariant 
tori often bears the name of Liouville or the combination Liouville-Arnold, 
and the action-angle variables theorem can similarly be found labeled with some 
combination of Liouville, Arnold, and Jost (with Mineur lurking in the wings). 


With some justification, or perhaps just out of exasperation, one might simply 
end up calling this whole circle of ideas the theory of Liouville integrability. 




PROBLEMS 
1. Let $G$ be a discrete subgroup of $\mathbb{R}^n$.

(a) If $G \ne \{0\}$, consider an element $b_1$ of $G$ that is closest to 0, and let $V_1$ be the subspace generated by $b_1$. Show that $G \cap V_1$ is generated over $\mathbb{Z}$ by $b_1$.

(b) If $G$ contains any elements not in $V_1$, show that it contains an element $b_2$ closest to $V_1$. (Hint: For any element not in $V_1$, there is another the same distance away that is also close to 0.)

(c) If $V_2$ is the subspace generated by $V_1$ and $b_2$, show that $G \cap V_2$ is generated over $\mathbb{Z}$ by $b_1$ and $b_2$.

(d) Continue this process inductively to show that $G$ is generated over $\mathbb{Z}$ by $k \le n$ elements (note: the induction is on $k$, not on $n$).

2. Let $C \subset \mathbb{R}^2$ be the closed curve $C = f^{-1}(0)$ where $\operatorname{grad} f = (f_x, f_y) \ne 0$ on $C$, and let $\nu$ be the outward pointing unit normal on $C$.

(a) Show that for some $\delta > 0$ we have $\langle\operatorname{grad} f, \nu\rangle > \delta$ on $C$, and use this fact, together with the fact that a tubular neighborhood of $C$ can be defined in terms of $\nu$, to conclude that for $J(h)$ = area enclosed by $f^{-1}(h)$, we have

\[ \frac{dJ}{dh}(0) \ne 0. \]

Thus, if we use $f$ as a coordinate for a neighborhood of $C$, then $\det(\partial J/\partial f) \ne 0$ on $C$.


(b) In the examples on pages 550-553, after fixing $\alpha = E$, separation of variables for the Hamilton-Jacobi equation for $W$ put it into the form

\[ W_1(q^1, \alpha_1,\dots,\alpha_n) + W_2(q^2, \alpha_2,\dots,\alpha_n) + \cdots + W_n(q^n, \alpha_n), \]

where $\alpha_i$ is the constant value of $p_i$; this means that each $J_i$ will depend only on $f_i,\dots,f_n$. Show that whenever we can write $W$ in this way we have

\[ \det\Bigl(\frac{\partial J_i}{\partial f_j}\Bigr) \ne 0, \]

so that $(q, J)$ will be a coordinate system.
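3.¹ Writing the term $\sqrt{\cdots}$ in the definition of $J_r$ on page 636 as $(\sqrt{\cdots})^2/\sqrt{\cdots}$, we get

\[ \frac{2\pi J_r}{\sqrt{2m}} = \oint_\gamma\frac{E + k/2r}{\sqrt{E + k/r - \alpha_\theta^2/2mr^2}}\,dr + \oint_\gamma\frac{k/2r}{\sqrt{\cdots}}\,dr - \oint_\gamma\frac{\alpha_\theta^2/2mr^2}{\sqrt{\cdots}}\,dr. \]

¹ From Calkin [1].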




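(a) The first integral is zero over the closed path $\gamma$.

(b) Problem 4-11 introduces the true anomaly $\theta$ and eccentric anomaly $\psi$ with

(i) $r/a = 1 - \varepsilon\cos\psi$,
(ii) $a(1 - \varepsilon^2)/r = 1 + \varepsilon\cos\theta$,

where, by the equations on pages 124-125,

\[ a = -k/2E, \qquad \varepsilon = \sqrt{1 + 2\alpha_\theta^2 E/mk^2}. \]

Using (i) as a substitution for the second integral and (ii) as a substitution for the third, show that

\begin{align*}
2\pi J_r &= \sqrt{-mk^2/2E}\oint_\gamma d\psi - \alpha_\theta\oint_\gamma d\theta\\
&= 2\pi\bigl(\sqrt{-mk^2/2E} - \alpha_\theta\bigr).
\end{align*}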


4. The integrand $\sqrt{\cdots}$ in the definition of $J_r$ on page 636 has two real roots $0 < r_1 < r_2$, so to obtain a complex analytic function we have to make a slit in $\mathbb{C}$ along the segment $[r_1, r_2]$ of the real axis.

(a) Since $\sqrt{\cdots} = p_r = m\dot r$, the trajectory of the particle is a path going from $r = r_1$ to $r = r_2$ and back again, so we have to integrate counterclockwise along a path surrounding $[r_1, r_2]$, as in the figure, where the rows of + signs and − signs show the proper sign for the square root on the two halves of the orbit.

[Figure: the contour surrounding the slit $[r_1, r_2]$, with the negative and positive sides of the square root marked.]

Along $\{x > r_2 + 0\cdot i\}$ the values of $\sqrt{\cdots}$ will thus be numbers $iy$ for real $y > 0$, while along $\{x < r_1 + 0\cdot i\}$ they will be $iy$ for real $y < 0$, and the integral will be the negative of the sum of those around the clockwise circles.

(b) The residue of $-\sqrt{\cdots}$ at 0 is $-i\alpha_\theta$, and the residue at $\infty$ is $ik\sqrt{-m/2E}$, leading to the same formula for $J_r$ as before.


CHAPTER 22 
EPILOGUE 


The theory of Liouville integrability may be regarded as the culmination of

classical mechanics, neatly packaged and gift-wrapped, which encompasses 
virtually all problems that could be solved classically. It is presented, in various 
forms, near the end of many classical treatises on mechanics, often followed by 
some energy-intensive investigations into the stubbornly intransigent three body 
problem, which mainly seemed to suggest that classical mechanics had reached 
the point of diminishing returns. 

Eventually, completely new mathematical methods were developed to inves- 
tigate nonintegrable systems, leading to a revitalization of classical mechanics. 
However, these developments in “modern” classical mechanics would easily fill 
another entire book, which may perhaps appear some day as Mechanics II. 
For now, we will merely follow a few leads from the previous chapter, to give a 
sense of some later developments, with a rather relaxed attitude toward rigor. 


Adiabatic invariants. The word adiabatic, from Greek ἀ- (not) and διαβαίνω =
pass through, is used in thermodynamics to indicate a process in which no 
heat enters or leaves a system. By a strange route, the term migrated from 
thermodynamics to mechanics, where an adiabatic invariant is a quantity that 
remains almost invariant when some parameter of a system changes very slowly. 

Probably the earliest example of an adiabatic invariant in mechanics is due to Lord Rayleigh [1], who showed that if the length $l$ of a pendulum is changed very slowly, then the ratio $E/\nu$ of the energy of the pendulum to its frequency remains nearly constant; more picturesquely, if the length changes infinitely slowly, then the ratio remains constant. In the situation shown in the figure,

[Figure: pendulum with a bob of mass $m$.]

the tension $T$ of the pendulum string is also the force that must be exerted in order to keep the pendulum from sliding down, i.e., the force needed to keep the length $l$ unchanged.

We are considering the case of small oscillations, so that we have the approximation, which we sloppily write as an equality,

\[ \cos\theta = 1 - \tfrac12\theta^2, \]

and thus for the tension $T$ we have (as on page 210)

\[ T = mg\cos\theta + ml\theta'^2 = mg - \tfrac12 mg\theta^2 + ml\theta'^2. \]


In order to pull the pendulum bob up extremely slowly we will need to exert a force $F$ just a tiny bit larger than $T$, and we will simply take $F = T$. In one complete period the length of the string changes by a very small amount, $l \to l + \delta l$ (where $\delta l < 0$). If $\langle\ \rangle$ denotes an average over this period, the total work done on the pendulum system over this period will be very close to

\[ -\delta l\cdot\langle F\rangle = -\delta l\cdot\langle T\rangle = -mg\cdot\delta l + \tfrac12 mg\cdot\delta l\,\langle\theta^2\rangle - ml\cdot\delta l\,\langle\theta'^2\rangle. \]

The $-mg\cdot\delta l$ term is the increase in potential energy, and the remaining work appears in the increase of the kinetic energy of the oscillating pendulum,

\[ \delta E = \tfrac12 mg\cdot\delta l\,\langle\theta^2\rangle - ml\cdot\delta l\,\langle\theta'^2\rangle. \]

We can save ourselves from more involved calculations (compare Problem 1) by noting that the average values of the potential energy and the kinetic energy of the pendulum over a period are the same, and thus equal to half the total energy $E$, so that

\[ \frac{m}{2}\,gl\,\langle\theta^2\rangle = \frac{m}{2}\,l^2\langle\theta'^2\rangle = \frac{E}{2}. \]

Dividing by $l$ and substituting these values into the formula for $\delta E$ then gives

\[ \delta E = \frac{\delta l}{2l}\,E - \frac{\delta l}{l}\,E = -\frac{\delta l}{2l}\,E, \]

which leads in the limit as $\delta l \to 0$ to the equation

\[ \frac{dE}{E} = -\frac12\,\frac{dl}{l} \implies E = \text{constant}\cdot l^{-1/2}, \]

and since the length of the period of the pendulum is $2\pi\sqrt{l/g}$, we thus have

\[ E = \nu\cdot\text{constant}. \]


This argument is taken¹ from Born [1]. Of course, one might have a sense of
unease about this derivation, which doesn’t take into account the possibility that
the tiny errors introduced by taking the average over a period, during which the
length of the pendulum is changing by a very small amount, might accumulate
into something significant over the very large number of periods. But Rayleigh 
and other physicists certainly didn’t worry about it. 
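One way to probe the unease just mentioned is to integrate the small-oscillation pendulum numerically while the length shrinks slowly over many periods, and compare $E/\nu$ at the start and at the end. The following is only a sketch under assumptions made here: unit mass, $g = 1$, a linear-in-time length $l(t)$, the linearized equation $l\theta'' + 2l'\theta' + g\theta = 0$, the oscillation energy $E = \tfrac12 l^2\theta'^2 + \tfrac12 gl\theta^2$, and a classical fourth-order Runge-Kutta step. None of these choices come from Born or Rayleigh.

```python
import math


def slow_pendulum(l_start=1.0, l_end=0.5, total_time=1000.0, dt=0.02,
                  g=1.0, theta0=0.1):
    """Integrate l*theta'' + 2*l'*theta' + g*theta = 0 with slowly varying l.

    Returns (E_start, nu_start, E_end, nu_end) for unit mass, where
    E = l^2 theta'^2 / 2 + g l theta^2 / 2 and nu = sqrt(g / l) / (2 pi).
    All parameter values are illustrative assumptions.
    """
    l_dot = (l_end - l_start) / total_time

    def length(t):
        return l_start + l_dot * t

    def rhs(t, theta, w):
        # First-order system: theta' = w, w' = -(2 l' w + g theta) / l.
        return w, -(2.0 * l_dot * w + g * theta) / length(t)

    def energy(t, theta, w):
        l = length(t)
        return 0.5 * l * l * w * w + 0.5 * g * l * theta * theta

    def frequency(t):
        return math.sqrt(g / length(t)) / (2.0 * math.pi)

    theta, w = theta0, 0.0
    e_start, nu_start = energy(0.0, theta, w), frequency(0.0)
    steps = int(round(total_time / dt))
    for i in range(steps):
        t = i * dt
        k1 = rhs(t, theta, w)
        k2 = rhs(t + dt / 2, theta + dt / 2 * k1[0], w + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, theta + dt / 2 * k2[0], w + dt / 2 * k2[1])
        k4 = rhs(t + dt, theta + dt * k3[0], w + dt * k3[1])
        theta += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        w += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    t_end = steps * dt
    return e_start, nu_start, energy(t_end, theta, w), frequency(t_end)


if __name__ == "__main__":
    e0, nu0, e1, nu1 = slow_pendulum()
    print("energy ratio E_end / E_start:", e1 / e0)
    print("ratio of E/nu, end to start: ", (e1 / nu1) / (e0 / nu0))
```

In this experiment the energy itself changes by roughly the factor $\sqrt{l_{\text{start}}/l_{\text{end}}}$, while $E/\nu$ stays nearly fixed, in line with the heuristic derivation; it is of course a numerical illustration, not a proof.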


¹ I can’t make any sense of Rayleigh’s argument, which, among other things, ignores
the $ml\theta'^2$ term in $T$, but still comes up with the same result, though later on, when
formulating problems in terms of Lagrangian mechanics, Rayleigh includes this term.




Rayleigh’s pendulum, in a slightly different form, was but the first of a series 
he discussed before using a generalization to derive the Stefan—Boltzmann law 
for black body radiation, which Boltzmann had derived from principles of ther- 
modynamics and electromagnetism. Paul Ehrenfest, a student of Boltzmann, 
extended Rayleigh’s idea to other questions of thermodynamics and later on he 
was led to apply ideas about adiabaticity to [the old] quantum mechanics. This
resulted, by a tangled path¹ (in which a pivotal role was played by Einstein,
virtually the only physicist who had paid any attention to Ehrenfest’s work), 
in what became known as Ehrenfest’s “adiabatic hypothesis”: since quantum 
transitions are caused only by influences that vary very rapidly, like light and 
molecular impacts, the only quantities that should be quantized are those that 
remain constant under the influence of ordinary phenomena that vary much
more slowly; these quantities were then dubbed “adiabatic invariants”. 

In view of Sommerfeld’s derivations of Bohr’s quantum conditions by the 
quantization of the action variables J;, this suggested that the action vari- 
ables should have this property of being adiabatic invariants, which was in 
fact demonstrated, with about the same degree of rigor as Rayleigh’s paper, 
by Burgers [1], a student of Ehrenfest. 

In the end, most methods of the old quantum mechanics became irrelevant, 
but adiabatic invariants play an important role in modern quantum mechanics 
as well. In the following sections, by contrast, we will only be considering some 
aspects of adiabatic invariants in classical mechanics, with a little more attention 
to mathematical proofs. 


The averaging principle. Integrable systems are quite special, and classical mechanics
might well owe its existence to the lucky fact that many important simple
systems are of this type. But eventually one has to deal with more general systems,
and the basic approach is by considering perturbations, systems that are
very close to integrable ones. For example, starting from a Liouville integrable
system for some Hamiltonian $H$, written in condensed notation as

$$\dot J_i = 0, \qquad \dot\varphi^i = v^i(J) = \frac{\partial H}{\partial J_i}, \qquad J = (J_1,\dots,J_n),$$

we can consider a nearby “perturbed” system, again in condensed notation,

$$\text{(a)}\qquad \dot J_i = \varepsilon f_i(\varphi, J), \qquad \dot\varphi^i = v^i(J) + \varepsilon g^i(\varphi, J)$$

for $\varepsilon \ll 1$, where $f_i$ and $g^i$ are periodic with period $2\pi$ in the $\varphi$ variables,
which are functions on $S^1$.


¹ See Laidler [1], Mehra and Rechenberg [1; 230-237], and the biography of Ehrenfest
by Klein [1]; detailed examination of Ehrenfest’s work is given in Navarro and Perez [1],
[2] and Perez [1].


Epilogue 647 


More generally, we can consider a system of $n + m$ equations of the form (a)
where $f$ and $g$ are functions of $(\varphi, J)$ for $n$ functions $\varphi^i$ on $S^1$ and $m$ functions
$J_i$ on $\mathbb{R}$. Note that the $J_i$ are increasing slowly, at the same rate as the very
small $\varepsilon$, while the $\varphi^i$ are increasing at a large rate compared to $\varepsilon$ (provided
$v^i \ne 0$), thus going through many cycles as the $J_i$ slowly change. So we might
hope that a good approximation to the $J_i$ can be found simply by solving the
“averaged equations”

$$\dot X = \varepsilon \bar f(X), \quad\text{that is}\quad \dot X_i = \varepsilon \bar f_i(X), \quad i = 1,\dots,m,$$

where $\bar f = (\bar f_1,\dots,\bar f_m)$ is the average of $f(\varphi, X)$ over a complete cycle of
the $\varphi^i$,

$$\bar f(X) = \frac{1}{(2\pi)^n}\int_0^{2\pi}\!\!\cdots\int_0^{2\pi} f(\varphi, X)\,d\varphi^1\cdots d\varphi^n.$$

In other words, we might hope that the averaged equation has a solution $\bar J_\varepsilon$ that
is close to the solution $J$ of our original equation for small $\varepsilon$. We don’t intend
to get into the study of perturbation theory here, but it is hardly surprising that
this approach might also be useful for studying adiabatic changes.

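To get some idea of how $\bar J_\varepsilon$ compares to $J$, we consider the simplest example,
with $n = m = 1$ and $v \ne 0$ constant, as well as $g = 0$, so that we simply have

$$\dot J = \varepsilon f(\varphi), \qquad \dot\varphi = v,$$

and $\varphi(t) = \varphi(0) + vt$, while $\bar f$ is just the constant $\bar f = \frac{1}{2\pi}\int_0^{2\pi} f(\varphi)\,d\varphi$. The
averaged equation is then

$$\dot X = \varepsilon \bar f,$$

and the solution with the same initial condition as $J$ is

$$\bar J_\varepsilon(t) = J(0) + \varepsilon \bar f\, t.$$

Setting $h = f - \bar f$, we can write

$$J(t) - J(0) = \varepsilon\int_0^t f(\varphi(0) + v\tau)\,d\tau = \varepsilon \bar f\, t + \varepsilon\int_0^t h(\varphi(0) + v\tau)\,d\tau = \varepsilon \bar f\, t + \frac{\varepsilon}{v}\int_{\varphi(0)}^{\varphi(0)+vt} h(u)\,du,$$

so that

$$J(t) - \bar J_\varepsilon(t) = \frac{\varepsilon}{v}\int_{\varphi(0)}^{\varphi(0)+vt} h(u)\,du.$$

Since $h$ is periodic with average 0, this integral is bounded independently of $t$, and we have

$$|J(t) - \bar J_\varepsilon(t)| < \varepsilon c$$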


648 Chapter 22 


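for some constant $c$. For small $\varepsilon$ the function $J$ is thus the sum of an “evolution”
part $\bar J_\varepsilon$, with derivative $\varepsilon \bar f$, and a small oscillatory part.

[Figure: two curves, labeled $J$ and $\bar J_\varepsilon$.]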
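The estimate above can be checked numerically. The following is only a sketch under stated assumptions: the choice $f(\varphi) = 1 + \cos\varphi$ (so $\bar f = 1$), the initial data, the step size, and the helper names `rk4_step` and `max_gap` are all illustrative and not taken from the text. For this $f$ the exact gap is $(\varepsilon/v)(\sin(\varphi(0)+vt) - \sin\varphi(0))$, so it should stay below $2\varepsilon/|v|$ on the whole interval $[0, 1/\varepsilon]$.

```python
import math

def rk4_step(f, t, y, h):
    """One classical Runge-Kutta step for y' = f(t, y), with y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def max_gap(eps, v=1.0, J0=0.0, phi0=0.3, steps_per_unit=50):
    """Largest sampled |J(t) - Jbar(t)| on [0, 1/eps] for f(phi) = 1 + cos(phi)."""
    f_bar = 1.0  # average of 1 + cos(phi) over one period (illustrative f)
    rhs = lambda t, y: [eps * (1.0 + math.cos(y[1])), v]  # y = [J, phi]
    n = max(1, round(steps_per_unit / eps))
    h = (1.0 / eps) / n
    t, y, worst = 0.0, [J0, phi0], 0.0
    for _ in range(n):
        y = rk4_step(rhs, t, y, h)
        t += h
        j_bar = J0 + eps * f_bar * t          # solution of the averaged equation
        worst = max(worst, abs(y[0] - j_bar))
    return worst

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print(eps, max_gap(eps), "bound:", 2 * eps)
```

Shrinking `eps` by a factor of 10 should shrink the reported gap by about the same factor, which is the $O(\varepsilon)$ behavior described above.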


A more specific example, showing the difficulty of a direct analysis, is given in Wells
and Siklos [1], which considers a slow change of parameter for the harmonic oscillator,

$$H(q, p, \omega(t)) = \tfrac12\big(p^2 + \omega(t)^2 q^2\big), \qquad\text{for } \omega(t)^2 = 1 + \varepsilon t$$

(we might imagine, for example, that $\omega$ is varied by slowly heating an oscillating spring).
For $x = -\varepsilon^{-2/3}(1 + \varepsilon t)$, the harmonic oscillator equation $\ddot q + (1 + \varepsilon t)q = 0$ reduces to

$$q'' - xq = 0 \qquad (q'' = d^2q/dx^2) \qquad [\text{i.e., } g = q\circ x^{-1} \text{ satisfies } g''(x) - xg(x) = 0].$$

This is the Airy equation, whose solutions can be written in terms of the two special Airy
functions $\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$. Using known asymptotic expansions for the Airy functions,
Wells and Siklos show that

$$J = H/\omega = \tfrac12 + \tfrac14\,\varepsilon\,\omega^{-3}\sin\theta\cos\theta + O(\varepsilon^2) \qquad\text{for } \theta = \frac{2}{3}\,\frac{\omega^3 - 1}{\varepsilon},$$

so that in this case the evolution term is 0, and $J$ is clearly an adiabatic invariant
(although we haven’t yet defined exactly what that means).
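The behavior of $J = H/\omega$ for this oscillator can also be observed by direct numerical integration. This is a rough sketch, not the Airy-function analysis of Wells and Siklos: the initial data $q(0) = 1$, $p(0) = 0$, the RK4 integrator, the step size, and the function name `action_drift` are assumptions made only for illustration.

```python
import math

def action_drift(eps, steps_per_unit=200):
    """Max sampled |J(t) - J(0)| on [0, 1/eps] for q'' + (1 + eps*t) q = 0,
    where J = H/omega and H = (p^2 + omega^2 q^2)/2."""
    def rhs(t, q, p):
        return p, -(1.0 + eps * t) * q

    def action(t, q, p):
        w = math.sqrt(1.0 + eps * t)
        return 0.5 * (p * p + w * w * q * q) / w

    n = max(1, round(steps_per_unit / eps))
    h = (1.0 / eps) / n
    t, q, p = 0.0, 1.0, 0.0          # illustrative initial conditions
    J0, worst = action(t, q, p), 0.0
    for _ in range(n):
        # one classical RK4 step for the pair (q, p)
        k1q, k1p = rhs(t, q, p)
        k2q, k2p = rhs(t + h / 2, q + h / 2 * k1q, p + h / 2 * k1p)
        k3q, k3p = rhs(t + h / 2, q + h / 2 * k2q, p + h / 2 * k2p)
        k4q, k4p = rhs(t + h, q + h * k3q, p + h * k3p)
        q += h / 6 * (k1q + 2 * k2q + 2 * k3q + k4q)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        t += h
        worst = max(worst, abs(action(t, q, p) - J0))
    return worst

if __name__ == "__main__":
    for eps in (0.1, 0.01):
        print(eps, action_drift(eps))
```

Over the same interval the energy $H$ itself changes by an amount of order 1, since $\omega$ grows from 1 to $\sqrt2$, while the printed drift of $J$ should be small compared with $\varepsilon$.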





The “averaging principle” asserts, or assumes, that the motion of equation (a) 
is always such a sum of an “evolution” part, obtained from the averaged equa- 
tion, plus a small oscillatory part. It is only an “aspirational” principle (it ain’t 
necessarily so), and we will examine just one, quite special, case where it is true. 


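An averaging theorem for one-dimensional systems. We consider the system

$$\dot J_i = 0, \qquad \dot\varphi = v(J), \qquad i = 1,\dots,m$$

for a given initial value $J(0)$, where $\varphi$ is defined on $S^1$, while $J$ is defined on
some open set $G \subset \mathbb{R}^m$. The perturbed system is

$$\text{(1)}\qquad \dot J = \varepsilon f(\varphi, J), \qquad \dot\varphi = v(J) + \varepsilon g(\varphi, J)$$

with $f$ and $g$ periodic of period $2\pi$ in $\varphi$, and the averaged system is

$$\text{(2)}\qquad \dot X = \varepsilon \bar f(X), \qquad \bar f(X) = \frac{1}{2\pi}\int_0^{2\pi} f(\varphi, X)\,d\varphi.$$

We denote the solution of (2) with $X(0) = J(0)$ by $\bar J_\varepsilon$.
Our aim is to show that for sufficiently small $\varepsilon$, the function $\bar J_\varepsilon$ is close to $J$.
Moreover, we want it to be close for a long time, $0 \le t \le 1/\varepsilon$.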


Epilogue 649 


Our theorem will have three basic hypotheses. The first will be some regularity
conditions on $f$, $g$, and $v$. The second will require that $v$ is bounded away
from 0 on $G$; for example, in the case of the pendulum, we must stay away from
the separatrix, where we no longer have periodic orbits. The third hypothesis
is the kicker: We want to assume that for sufficiently small $\varepsilon$

$$\bar J_\varepsilon(t) \in G \quad\text{for all } 0 \le t \le 1/\varepsilon,$$

which at first sight seems to be unrealistically demanding.

One case where it presents no particular problem is the simplest example
examined on page 647, where $\bar J_\varepsilon(t) = J(0) + \varepsilon \bar f\, t$ for the constant $\bar f$; we only
need that the segment from $J(0)$ to $J(0) + \bar f$ lies in $G$. The situation is even
simpler if $\bar f = 0$, so that $\bar J_\varepsilon = J(0)$ for all $t$! As a matter of fact, the very
situation where we will be applying our theorem is one where $\bar f = 0$; in other
words, we have a very restrictive theorem, which just happens to be enough for
our purposes.

For convenience, from now on we adopt the standard abuse of notation, and
simply write $\bar J$ instead of $\bar J_\varepsilon$, so you have to remember that the $\varepsilon$ is really there.

For technical reasons, we also need to introduce one slight complication.
Given a (small) number $d > 0$, we will let $G_d$ denote the points of $G$ for which
the open disc of radius $d$ around that point is contained in $G$ (alternately, $G_d$
is the set of points at distance $\ge d$ from the boundary of $G$). We will need to
consider $G_d$ rather than $G$ itself, because $\bar J$ might still be in $G$ at a time when $J$
has already passed over the boundary of $G$, so that the equations aren’t even
defined.


Finally, we will use the following lemma.

LEMMA. Let $x$ be an $\mathbb{R}^m$-valued $C^1$ function, and let $a, b > 0$. Suppose that

$$|x'(t)| \le a|x(t)| + b,$$

using $|\ |$ for the norm in $\mathbb{R}^m$. Then

$$|x(t)| \le \big(|x(0)| + bt\big)e^{at}.$$

PROOF. If $y$ satisfies

$$y'(t) = ay + b, \qquad y(0) = |x(0)|,$$

then $|x(t)| \le y(t)$. But

$$y(t) = C(t)e^{at}$$

for some function $C$, and we find that $C'(t)e^{at} = b$, or $C'(t) = e^{-at}b$; since
$C(0) = |x(0)|$, we thus have $C \le |x(0)| + bt$. ∎


650 Chapter 22 


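SIMPLE AVERAGING THEOREM. Let $G \subset \mathbb{R}^m$ be an open set, and let
$d > 0$. Consider the solution $J$ of the equation

$$\text{(1)}\qquad \dot J = \varepsilon f(\varphi, J), \qquad \dot\varphi = v(J) + \varepsilon g(\varphi, J)$$

on $S^1 \times G$, with $f$ and $g$ periodic of period $2\pi$ in $\varphi$, for the initial value $J(0)$, and
the averaged system

$$\text{(2)}\qquad \dot X = \varepsilon \bar f(X), \qquad \bar f(X) = \frac{1}{2\pi}\int_0^{2\pi} f(\varphi, X)\,d\varphi$$

on $G$ for the same initial value $J(0)$.

Suppose that

(1) the functions $f$, $g$, and their partial derivatives are bounded on $S^1 \times G$,
and similarly for $v$ on $G$;

(2) the function $v$ is bounded away from 0 on $G$;

(3) for sufficiently small $\varepsilon$, the solution $\bar J$ of (2) is defined on $[0, 1/\varepsilon]$ and

$$\bar J(t) \in G_d \quad\text{for all } 0 \le t \le 1/\varepsilon.$$

Then for all sufficiently small $\varepsilon$ we have

$$\max_{t\in[0,1/\varepsilon]} |J(t) - \bar J(t)| = O(\varepsilon)$$

(implicit in this equation is the fact that $J(t)$ is defined on $[0, 1/\varepsilon]$).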


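PROOF. If we set

$$P(\varphi, J) = J + \varepsilon S(\varphi, J)$$

for a function $S$, then equation (1) gives

$$\dot P = \dot J + \varepsilon\frac{\partial S}{\partial\varphi}\dot\varphi + \varepsilon\frac{\partial S}{\partial J}\dot J = \varepsilon\Big[f(\varphi, J) + v(J)\frac{\partial S}{\partial\varphi}\Big] + \varepsilon^2 g\frac{\partial S}{\partial\varphi} + \varepsilon^2\frac{\partial S}{\partial J}f.$$

In particular, we will choose

$$S(\varphi, J) = -\frac{1}{v(J)}\int_0^{\varphi}\big[f(\psi, J) - \bar f(J)\big]\,d\psi,$$

so that

$$f(\varphi, J) + v(J)\frac{\partial S}{\partial\varphi} = \bar f(J),$$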


Epilogue 651 


and we then simply get

$$\dot P = \varepsilon \bar f(J) + O(\varepsilon^2) = \varepsilon \bar f(P) + O(\varepsilon^2).$$

Comparing with

$$\dot{\bar J} = \varepsilon \bar f(\bar J),$$

we see that $x = P - \bar J$ satisfies

$$|x'| \le \varepsilon A|x| + O(\varepsilon^2),$$

where $A$ is a bound for the derivatives of $\bar f$.
The Lemma then shows that there is a constant $C$ such that

$$(*)\qquad |P(t) - \bar J(t)| \le C\varepsilon e^{\varepsilon A t} \quad\text{for } t = O(1/\varepsilon).$$

Since we also have $|J(t) - P(t)| = O(\varepsilon)$ for all $t$, it looks as if we have the
conclusion in the statement of the theorem.

However, there is one delicate point. This argument is only true for $t < \tau$,
where $\tau$ is the first time that $P(t)$ hits the boundary of $G$ (if ever). So we have
to show that $\tau$ is at least of order $1/\varepsilon$. To do this, we note that $(*)$ implies that
for small enough $\varepsilon$ we have $|P(t) - \bar J(t)| < d/2$. So $P(t) \in G_{d/2}$ for $t \in [0, 1/\varepsilon]$,
and this implies that $J$ remains in $G$ on the same interval, for small enough $\varepsilon$.

Thus, $J$, $P$, and $\bar J$ all stay in $G$ for times of order $1/\varepsilon$. ∎


This theorem first appeared in Arnold [2; §52C, D]. The proof given here 
is based on a simplified proof in Lochak and Meunier [1], where a somewhat 
more general version is given, just the beginning of a very extensive coverage. 


Adiabatic invariance of J. Now we will apply this result to study a Hamiltonian
$H(q, p, \lambda)$ with a parameter, for a 1-dimensional system. It is convenient to investigate
slow changes of the parameter by letting $\lambda$ be the “slow time”, defined
by $\lambda = \varepsilon t$ for small $\varepsilon$, and then considering the solutions $t \mapsto (q(t), p(t))$ of the
system

$$\dot q = \frac{\partial H}{\partial p}(q, p, \varepsilon t), \qquad \dot p = -\frac{\partial H}{\partial q}(q, p, \varepsilon t)$$

with given initial conditions $(p(0), q(0))$.

DEFINITION. A quantity $A(q, p, \lambda)$ is called an adiabatic invariant of this
system if

$$\lim_{\varepsilon\to 0}\ \max_{t\in[0,1/\varepsilon]} |A(q(t), p(t), \varepsilon t) - A(q(0), p(0), 0)| = 0.$$


In other words, keeping the equations close to the original equations will keep 
the solutions close for a long time. 


652 Chapter 22 


Our aim is to show that when the Hamiltonians $H(q, p, \lambda)$ have action-angle
variables, the action variable $J(q, p, \lambda)$ is an adiabatic invariant. We are now
working on a cotangent bundle $T^*M$ (presumably $T^*S^1$), so, as on page 633,
we have the function $J(q, p, \lambda)$ and a type 2 generating function $S(q, J, \lambda)$
(multiple-valued mod $2\pi$) with

$$\text{(1)}\qquad p = \frac{\partial S}{\partial q}\big(q, J(q, p, \lambda), \lambda\big), \qquad \varphi = \frac{\partial S}{\partial J}\big(q, J(q, p, \lambda), \lambda\big).$$


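However, since $\lambda = \varepsilon t$, this is now a time-dependent canonical transformation,
and the new Hamiltonian has the form

$$\text{(2)}\qquad H_{\text{new}} = H_0 + \varepsilon\frac{\partial S}{\partial\lambda},$$

where $H_0$ is $H(q, p, 0)$ expressed in terms of the $(\varphi, J)$ variables [i.e., composed
with $g^{-1}$ for the map $g(q, p, \lambda) = (\varphi(q, p, \lambda), J(q, p, \lambda), \lambda)$ for $\varphi$ given by (1)].
Note that although the function $S$ is multiple-valued, the function $\partial S/\partial\lambda$ is
single-valued, and periodic.

Hamilton’s equations for this new Hamiltonian are

$$\text{(3)}\qquad \dot J = \varepsilon f, \quad f = -\frac{\partial}{\partial\varphi}\Big(\frac{\partial S}{\partial\lambda}\Big); \qquad \dot\varphi = v(J, \lambda) + \varepsilon g, \quad g = \frac{\partial}{\partial J}\Big(\frac{\partial S}{\partial\lambda}\Big).$$

We want to apply the averaging theorem, for the case $m = 2$ (with $J$ and $\lambda$
being $J_1$ and $J_2$), to the averaged system

$$\dot X_1 = \varepsilon \bar f, \qquad \dot X_2 = \varepsilon$$

to get a solution $(\bar J, \bar\lambda)$ with initial conditions $(J(0), 0)$. Of course, $\bar\lambda = \varepsilon t$,
which is no news, but

$$\bar f = -\frac{1}{2\pi}\int_0^{2\pi}\frac{\partial}{\partial\varphi}\Big(\frac{\partial S}{\partial\lambda}\Big)\,d\varphi = 0 \quad\text{since } \frac{\partial S}{\partial\lambda} \text{ is periodic},$$

so $\bar J$ is constant, $\bar J(t) = J(0)$, and we can indeed apply the averaging theorem,
which then says that

$$\max_{t\in[0,1/\varepsilon]} |J(t) - J(0)| = O(\varepsilon). \qquad\text{Q.E.D.}$$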


Epilogue 653 


Actually, this proof isn’t completely rigorous, as it depends on the additional 
assumption that we had to make for the construction of action-angle variables 
on T*M. However, we won’t take the time to worry about this detail. The 
remainder of this chapter is presented somewhat in physics mode, and we relax 
our concern with rigor a bit (it makes the physics so much more fun!). 


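The Hannay angle. The generating function $S$ in equation (1) was chosen to
give the transformation $(q, p, \lambda) \mapsto (\varphi, J, \lambda)$ in our first proof of the existence
of action-angle variables. It will now be somewhat more useful for us to consider
the inverse transformation $(J, \varphi, \lambda) \mapsto (p, q, \lambda)$, where we have also flipped the
order of the variables, and a type 2 generating function $S$ for this transformation.
Consulting the recipe given at the bottom of page 580, and remembering
that $Q$ and $P$ are now $p$ and $q$, while $q$ and $p$ are $J$ and $\varphi$, we obtain the
function $q(J, \varphi, \lambda)$ with

$$\text{(1$'$)}\qquad p = \frac{\partial S}{\partial q}\big(q(J, \varphi, \lambda), J, \lambda\big), \qquad \varphi = \frac{\partial S}{\partial J}\big(q(J, \varphi, \lambda), J, \lambda\big).$$

Instead of setting $\lambda = \varepsilon t$, as before, now we simply let $\lambda = t$, and we write the
new Hamiltonian analogous to (2) a bit more carefully

$$\text{(2$'$)}\qquad H_{\text{new}} = H_0 + \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big),$$

with Hamilton’s equations for this new Hamiltonian being

$$\text{(3$'$)}\qquad \begin{aligned} \dot J &= f(J, \varphi, t), & f &= -\frac{\partial}{\partial\varphi}\Big[\frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big)\Big],\\ \dot\varphi &= v(J, t) + g(J, \varphi, t), & g &= \frac{\partial}{\partial J}\Big[\frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big)\Big]. \end{aligned}$$

In all of these more carefully written equations, $\partial S/\partial t$ simply denotes $D_3S$, the
partial derivative of $S$ with respect to its third variable, and $\frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big)$
must be distinguished from

$$\frac{\partial\big(S(q(J, \varphi, t), J, t)\big)}{\partial t},$$

i.e., $\frac{\partial\tilde S}{\partial t}(J, \varphi, t)$ for $\tilde S(J, \varphi, t) = S(q(J, \varphi, t), J, t)$.
For the latter we have

$$\frac{\partial\tilde S}{\partial t}(J, \varphi, t) = \frac{\partial S}{\partial q}\big(q(J, \varphi, t), J, t\big)\,\frac{\partial q}{\partial t}(J, \varphi, t) + \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big) = p\,\frac{\partial q}{\partial t}(J, \varphi, t) + \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big),$$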


654 Chapter 22 


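so that we can write, with some arguments omitted, and with $\tilde S(J, \varphi, t) = S(q(J, \varphi, t), J, t)$,

$$\text{(4)}\qquad \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big) = \frac{\partial\tilde S}{\partial t} - p\,\frac{\partial q}{\partial t}.$$

In particular, suppose that our system returns to its original state after some time $T$.
We will then have

$$\int_0^T \frac{\partial\tilde S}{\partial t}\,dt = \tilde S(T) - \tilde S(0) = 0,$$

so that

$$\text{(5)}\qquad \int_0^T \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big)\,dt = -\int_0^T p(J, \varphi, t)\,\frac{\partial q}{\partial t}\,dt,$$

a formula that will be very useful for analysing cyclic processes.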


While the adiabatic invariance of the action variable $J$ had long been a subject
of interest, not much attention was paid to the general behavior of the
angle variable $\varphi$. Then in a rather strange reversal, quantum mechanics led
to new considerations in classical mechanics. In 1984 Berry [1] pointed out
that the standard description of adiabatic processes in quantum mechanics was
incomplete, missing an additional term for the change of the quantum phase
that explained (or at any rate, accounted for) mysterious phenomena like the
“Aharonov-Bohm effect”. This led Hannay [1] (as well as Berry [2]) to consider
the analogous question for the classical case, involving the change of the angle
variable $\varphi$.

From the equation for $\dot\varphi$ in (3$'$), we see that

$$\varphi(T) - \varphi(0) = \int_0^T v(J, t)\,dt + \frac{\partial}{\partial J}\int_0^T \frac{\partial S}{\partial t}\big(q(J, \varphi, t), J, t\big)\,dt.$$

The first term, the “dynamical angle”, is what we’d expect from the fact that
$\dot\varphi = v(J)$, while the second term is a correction that is necessary because $J$
changes with time.
Suppose now that the solution $\gamma$ of Hamilton’s equations for some fixed value
of $J$ traverses a closed curve $C$ for $0 \le t \le T$. Then by (5) we have

$$\varphi(T) - \varphi(0) = \int_0^T v(J, t)\,dt - \frac{\partial}{\partial J}\int_0^T p(J, \varphi, t)\,\frac{\partial q}{\partial t}(J, \varphi, t)\,dt,$$

where $\varphi(T)$ really means $\varphi(\gamma(T))$, and $v(J, t)$ really means $v(J(\gamma(t)), t)$, and
so forth through the whole gamut.


Epilogue 655 


We naturally might expect for the adiabatic case, where $\varphi$ is making many
circuits as $J$ varies slowly in time, that we can approximate the integrand in the
second integral by its averaged value,

$$\Big\langle p\,\frac{\partial q}{\partial t}\Big\rangle(J, t) = \frac{1}{2\pi}\int_0^{2\pi} p(J, \varphi, t)\,\frac{\partial q}{\partial t}(J, \varphi, t)\,d\varphi,$$

so that, in the adiabatic limit,

$$\varphi(T) - \varphi(0) = \int_0^T v(J, \lambda)\,d\lambda - \frac{\partial}{\partial J}\int_0^T \Big\langle p\,\frac{\partial q}{\partial t}\Big\rangle(J, t)\,dt,$$

with the derivative with respect to $J$ evaluated at the given value of $J$.

We won’t try to give any justification for this assumption here, instead adopting
the physicist’s approach, laconically expressed in this passage from Hannay [1]
concerning similar considerations for the adiabatic invariance of $J$:

    Although the adiabatic principle ... is well defined and widely realised
    physically (Landau and Lifschitz 1976), it appears (Arnold 1978)
    to be surprisingly difficult to eliminate the mathematical loopholes...
    I shall take it for granted.

The quantity

$$\Delta\varphi(C) = -\frac{\partial}{\partial J}\int_0^T \Big\langle p\,\frac{\partial q}{\partial t}\Big\rangle(J, t)\,dt,$$

where the integral is sometimes written as $\int_C \langle p\,dq\rangle$ or something similar, is
called the Hannay angle, or the “geometric angle”, since it depends on the shape
of the closed curve $C$. Thus we have, in the adiabatic limit,

$$\varphi(T) - \varphi(0) = \underbrace{\int_0^T v(J, \lambda)\,d\lambda}_{\text{dynamical angle}} + \underbrace{\Delta\varphi(C)}_{\text{geometric angle}}.$$

Note that although we have used the term “adiabatic limit”, our situation is really
different from that considered for the adiabatic invariance of $J$, as reflected
in the fact that we have been working with $\lambda = t$. There actually is no additional
parameter $\lambda$ involved here. The interesting point is that even though $\varphi$
is changing much faster than $J$, repeating its position over and over again, the
variation in $J$ can result in a cumulative mod $2\pi$ change of $\varphi$.

To illustrate the Hannay angle, and partly to consider the evidence for our
averaging assumption, we will consider two classical phenomena that are often
discussed in connection with the Hannay angle, even though their analysis
doesn’t necessarily use the formula for it.


656 Chapter 22 


The Hannay hoop. Consider a planar hoop $C$, along which a bead, of mass
$m = 1$ for convenience, is sliding without friction, so that its velocity is constant,

[Figure: the hoop and its rotated image, labeled $R_{\theta(t)}$.]

and suppose that the entire hoop is being rotated very slowly in its own plane,
i.e., about an axis perpendicular to the plane; at time $t$ the hoop will be rotated
by the rotation $R_{\theta(t)}$ through an angle of $\theta(t)$, where $\theta'(t)$ is very small.

• We will begin by analysing this situation in the same spirit as the analysis
of Rayleigh’s pendulum, using a hands-on elementary mechanics approach,¹
so that we can see what is actually happening (or what would be happening
if we could actually realize this completely idealized situation). Looking at the
plane of the hoop from above (a), suppose that $O$ is the point of the plane
about which we are rotating the hoop, $P$ is the position of the bead at some
time, and $\alpha$ is the angle between $OP$ and the tangent vector to the hoop at $P$.
If the hoop is rotated (b) by a small angle $\Delta\theta$, the position $P$ on the hoop goes
to a position $A$ with $AP$ very nearly perpendicular to $OP$ and having length
very close to $r\Delta\theta$, where $r$ is the length of $OP$. Decomposing the displacement
$PA$ into $PB$ perpendicular to the tangent of the curve and $BA$ in the direction
of the tangent, we have for their lengths

$$PB = r\Delta\theta\cos\alpha, \qquad BA = r\Delta\theta\sin\alpha.$$

The displacement $PB$ causes a force that pushes the hoop perpendicularly
against the bead (for $\alpha \ne \pi/2$), which keeps the bead moving along with the
hoop [one of those implicit assumptions inherent in this idealized problem]. On
the other hand, since the bead is sliding frictionlessly, the displacement $BA$ in
the direction along the hoop has no effect on the bead. Consequently, if $s$ is
the arclength along the hoop, measured from some fixed point, the bead falls
behind the hoop by the amount $\Delta s = -r\Delta\theta\sin\alpha$.

¹ From Calkin [1; pp. 183-184].


Epilogue 657 


The bead is making many circuits around the hoop during the time that the
hoop rotates by the small amount $\Delta\theta$, and for a hoop of length $\ell$ the averaged
value of $\Delta s$ is

$$\langle\Delta s\rangle = -\Big(\frac{1}{\ell}\int_0^{\ell} r(s)\sin\alpha(s)\,ds\Big)\cdot\Delta\theta,$$

where we take the liberty of writing $=$ instead of $\approx$ “in the adiabatic limit”.
Now we want to use the formula

$$\text{(A)}\qquad 2A = \int_0^{\ell} r(s)\sin\alpha(s)\,ds.$$

This can be seen geometrically by noting that for a small change $\delta s$ of $s$, the
quantity $r(s)\sin\alpha(s)\,\delta s$ is very close to twice the area of the triangle $OPQ$ when
the arclength from $P$ to $Q$ is $\delta s$.

[Figure: the triangle $OPQ$, with $r(s)$ the length of $OP$ and $\alpha(s)$ the angle at $P$.]

A more formal derivation is given in Problem 2.
Using (A), we now have

$$\langle\Delta s\rangle = -(2A/\ell)\,\Delta\theta.$$

If the slowly rotating hoop goes through a complete revolution, so that at the
end of the process $\Delta\theta = 2\pi$, we then get

$$\langle\Delta s\rangle = -4\pi A/\ell.$$

Thus the amount the bead ends up behind, averaged over all initial positions
of the bead, is $-4\pi A/\ell$.
This is often written as

$$\langle\Delta\varphi\rangle = -8\pi^2 A/\ell^2 \qquad\text{for } \varphi = \frac{2\pi}{\ell}s$$

(note that $\varphi$ is not the polar coordinate angle around $O$, but the angle coordinate
for the action-angle variables, which we will use later [page 660]). For a circular
hoop we have $\langle\Delta\varphi\rangle = -2\pi$. This is obviously correct in the case where the axis
of rotation is the center of the circle, since we then have $\alpha = \pi/2$ at all times,
and the bead isn’t being affected by the hoop at all. Conversely, if $\langle\Delta\varphi\rangle = -2\pi$,
then the hoop must be circular, since we can write

$$\Delta\varphi = -2\pi + 2\pi\Big(1 - \frac{4\pi A}{\ell^2}\Big),$$

where the expression in parentheses is, by the isoperimetric problem, always
positive for a non-circular hoop.
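Formula (A) and the resulting value of $\langle\Delta\varphi\rangle$ are easy to test numerically for a specific hoop. The sketch below assumes an elliptical hoop with the rotation center $O$ at the origin and inside the ellipse, and it evaluates $r\sin\alpha\,ds$ as the cross product $x\,dy - y\,dx$ (which agrees with $r\sin\alpha\,ds$ when $O$ lies inside a convex hoop traversed counterclockwise). The ellipse, the quadrature rule, and the function names are illustrative choices, not part of the text.

```python
import math

def hoop_quantities(a, b, cx=0.0, cy=0.0, n=20000):
    """Ellipse with semi-axes a, b centred at (cx, cy); rotation centre O at the origin.

    Returns (integral of r*sin(alpha) ds, length ell, area A), with
    r*sin(alpha) ds computed as x dy - y dx along the curve.
    """
    du = 2.0 * math.pi / n
    cross = 0.0
    length = 0.0
    for k in range(n):
        u = (k + 0.5) * du                       # midpoint rule in the parameter u
        x = cx + a * math.cos(u)
        y = cy + b * math.sin(u)
        dx = -a * math.sin(u)                    # derivatives with respect to u
        dy = b * math.cos(u)
        cross += (x * dy - y * dx) * du
        length += math.hypot(dx, dy) * du
    return cross, length, math.pi * a * b

def mean_angle_shift(a, b, **kw):
    """<Delta phi> = -8 pi^2 A / ell^2 for one complete slow turn of the hoop."""
    _, ell, area = hoop_quantities(a, b, **kw)
    return -8.0 * math.pi ** 2 * area / ell ** 2

if __name__ == "__main__":
    cross, ell, area = hoop_quantities(2.0, 1.0, cx=0.3, cy=-0.2)
    print("integral:", cross, " 2A:", 2 * area)
    print("circle :", mean_angle_shift(1.0, 1.0), " -2 pi:", -2 * math.pi)
    print("ellipse:", mean_angle_shift(2.0, 1.0))
```

For the circle the shift should come out as $-2\pi$, and for the ellipse it should be smaller in absolute value, in line with the isoperimetric remark above.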


658 Chapter 22 


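• For a more detailed analytical treatment using Lagrange’s equations,¹ we
consider the hoop as our configuration space, with the discussion of the displacements
on the bottom of page 656 now magically subsumed under the hypothesis
that the bead is confined to move on the hoop. For a number $s_0$, let $\mathbf{q}(s_0)$ be
[the vector from the origin $O$ to] the point on the hoop whose distance from
some fixed point is $s_0$, where the distance is being measured by the arclength
along the hoop. Then $\mathbf{q}'(s_0)$ will be the unit tangent vector $\mathbf{t}(s_0)$ to the hoop
at that point. At time $t$ let the bead be at the point $\mathbf{q}(s(t))$ on the hoop for
some function $s$. Then in the non-rotating coordinate system, it is at the point
$\mathbf{Q}(t) = R_{\theta(t)}\mathbf{q}(s(t))$, where we now regard $R_{\theta(t)}$ as a $3 \times 3$ matrix.

If $\mathbf{u}$ is the unit vector at $O$ perpendicular to the plane, then for any vector $\mathbf{v}$
in the plane we can write

$$\frac{d}{dt}\big(R_{\theta(t)}(\mathbf{v})\big) = \dot\theta(t)\,\mathbf{u}\times R_{\theta(t)}(\mathbf{v}) = \boldsymbol\omega(t)\times R_{\theta(t)}(\mathbf{v}), \text{ say},$$

so that the derivative of $\mathbf{Q}$ is given by

$$\begin{aligned} \dot{\mathbf{Q}}(t) &= \frac{d}{dt}\big[R_{\theta(t)}\mathbf{q}(s(t))\big]\\ &= R_{\theta(t)}\mathbf{t}(s(t))\,\dot s + \boldsymbol\omega(t)\times R_{\theta(t)}\big(\mathbf{q}(s(t))\big)\\ &= R_{\theta(t)}\big[\mathbf{t}(s(t))\,\dot s + \boldsymbol\omega(t)\times\mathbf{q}(s(t))\big]. \end{aligned}$$

The Lagrangian is just the kinetic energy, so

$$L(s, \dot s, t) = \tfrac12\big|\mathbf{t}(s)\dot s + \boldsymbol\omega(t)\times\mathbf{q}(s)\big|^2 = \tfrac12\big[\mathbf{t}(s)\dot s + \boldsymbol\omega(t)\times\mathbf{q}(s)\big]\bullet\big[\mathbf{t}(s)\dot s + \boldsymbol\omega(t)\times\mathbf{q}(s)\big],$$

where we use $\bullet$ to indicate inner products, since $\langle\ \rangle$ has been preëmpted for
averaged values. Omitting arguments of functions, as usual for working with
Lagrangians, note that, with $m = 1$, the conjugate momentum $p = v$ to $s$ is

$$\text{(a)}\qquad v = p = \frac{\partial L}{\partial\dot s} = \mathbf{t}\bullet[\mathbf{t}\dot s + \boldsymbol\omega\times\mathbf{q}]$$

¹ Cf. Marsden and Ratiu [1; §8.7].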


Epilogue 659 


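and Lagrange’s equation

$$\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot s}\Big) = \frac{\partial L}{\partial s}$$

becomes

$$\frac{d}{dt}\,\mathbf{t}\bullet[\mathbf{t}\dot s + \boldsymbol\omega\times\mathbf{q}] = [\mathbf{t}\dot s + \boldsymbol\omega\times\mathbf{q}]\bullet[\mathbf{t}'\dot s + \boldsymbol\omega\times\mathbf{t}].$$

Using $\mathbf{t}\bullet\mathbf{t}' = 0$ and the vanishing of several triple-product terms $\mathbf{a}\bullet[\mathbf{b}\times\mathbf{c}]$, we
obtain

$$\ddot s - [\boldsymbol\omega\times\mathbf{q}]\bullet[\boldsymbol\omega\times\mathbf{t}] + \mathbf{t}\bullet[\dot{\boldsymbol\omega}\times\mathbf{q}] = 0,$$

with the second and third terms being the centrifugal force and the Euler force.
Since $\boldsymbol\omega = \dot\theta\mathbf{u}$, we get finally

$$\ddot s = \dot\theta^2\,\mathbf{q}\bullet\mathbf{t} - \ddot\theta\,|\mathbf{q}|\sin\alpha,$$

where $\alpha$ is the angle between $\mathbf{q}$ and $\mathbf{t}$, as before.
The integral form of the remainder in Taylor’s formula gives

$$s(t) = s(0) + \dot s(0)t + \int_0^t (t - \tau)\Big[\dot\theta(\tau)^2\,\mathbf{q}(s(\tau))\bullet\mathbf{t}(s(\tau)) - \ddot\theta(\tau)\,|\mathbf{q}|(s(\tau))\sin\alpha(s(\tau))\Big]\,d\tau.$$

Now we’d like to average, replacing the quantities involving $s$ in the integral
by their averages around the hoop. Assuming that $\dot\theta$ and $\ddot\theta$ are both small
in comparison to the velocity of the bead, it appears that there is a theorem¹
showing that for large $T$, like the time for the hoop to make one complete
revolution, the quantity $s(T) - s(0) - \dot s(0)T$ is approximately

$$\int_0^T (T - \tau)\Big[\dot\theta(\tau)^2\,\frac{1}{\ell}\int_0^{\ell}\mathbf{q}(s)\bullet\mathbf{t}(s)\,ds - \ddot\theta(\tau)\,\frac{1}{\ell}\int_0^{\ell}|\mathbf{q}|(s)\sin\alpha(s)\,ds\Big]\,d\tau.$$

Assuming this, we note that since $\mathbf{q}(s)\bullet\mathbf{t}(s) = \tfrac12(d/ds)[\mathbf{q}(s)\bullet\mathbf{q}(s)]$, the integral
of this term over the whole hoop vanishes, and the other integral, involving the
Euler force, is $2A$, by equation (A) on page 657 again, so that

$$\begin{aligned} s(T) &\approx s(0) + \dot s(0)T - \frac{2A}{\ell}\int_0^T (T - \tau)\,\ddot\theta(\tau)\,d\tau\\ &= s(0) + \dot s(0)T + \frac{2A}{\ell}\dot\theta(0)T - \frac{4\pi A}{\ell} \qquad\text{using integration by parts.} \end{aligned}$$

¹ Marsden and Ratiu [1] refer to Hale [1], though it seems one might have to read half
the book to extract the result.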


660 Chapter 22 


For the special case where $\dot\theta(0) = 0$, we simply have

$$s(T) \approx s(0) + \dot s(0)T - \frac{4\pi A}{\ell},$$

where $s(0) + \dot s(0)T$ would be the total arclength along the hoop if the bead traveled
with constant velocity $\dot s(0)$, with the correction term

$$\Delta s = -\frac{4\pi A}{\ell}.$$

In the general case, we must find the average displacement over all initial
positions, for a bead with constant velocity $v_0$, say. Equation (a) gives

$$v_0 = \dot s(0) + \dot\theta(0)\,|\mathbf{q}|(s(0))\sin\alpha(s(0)),$$

so we can write

$$s(T) - s(0) - v_0T \approx -\dot\theta(0)\,|\mathbf{q}|(s(0))\sin\alpha(s(0))\,T + \frac{2A}{\ell}\dot\theta(0)T - \frac{4\pi A}{\ell},$$

and the first term on the right averages out over the hoop to cancel the second
term, so the average shift, over all initial positions of the bead, is again $-4\pi A/\ell$.

• Finally, we want to consider how this question fits in with the Hannay angle.
Our manifold is now just the hoop $C$, and the Hamiltonian is

$$H(q, \mathbf{p}) = \frac{|\mathbf{p}|^2}{2},$$

with the bead having constant velocity $p = |\mathbf{p}|$ with respect to $C$. The action,
defined by $J = \frac{1}{2\pi}\oint_C p\,ds$, is just

$$\text{(1)}\qquad J = \frac{1}{2\pi}\,p\,\ell,$$

and on $C$ we have

$$\text{(2)}\qquad \varphi = \frac{2\pi}{\ell}s,$$

where the fraction $2\pi/\ell$ is used because, by definition (page 621), $\varphi$ goes once
around the invariant torus on the interval $[0, 2\pi]$.


Epilogue 661 


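In computing the Hannay angle, instead of using $t$ on $[0, T]$, where $T$ is the
time for a complete rotation of the hoop, we will use the corresponding angle $\theta$
through which the hoop has been rotated for $0 \le \theta \le 2\pi$. We then have, with $\mathbf{t}$
denoting the unit tangent vector to $C$,

$$\begin{aligned} \Delta\varphi &= -\frac{\partial}{\partial J}\int_0^{2\pi}\Big\langle \mathbf{p}\bullet\frac{\partial\mathbf{q}}{\partial\theta}\Big\rangle\,d\theta\\ &= -\frac{\partial}{\partial J}\int_0^{2\pi}\Big(\frac{1}{2\pi}\int_0^{2\pi}\mathbf{p}(J, \varphi, \theta)\bullet\frac{\partial\mathbf{q}}{\partial\theta}(J, \varphi, \theta)\,d\varphi\Big)\,d\theta\\ &= -\frac{\partial}{\partial J}\int_0^{2\pi}\Big(\frac{1}{2\pi}\int_0^{2\pi}\frac{2\pi J}{\ell}\,\mathbf{t}(\theta)\bullet\frac{\partial\mathbf{q}}{\partial\theta}(J, \varphi, \theta)\,d\varphi\Big)\,d\theta \qquad\text{by (1)}\\ &= -\frac{1}{\ell}\int_0^{2\pi}\Big(\int_0^{2\pi}\mathbf{t}(\theta)\bullet\frac{\partial\mathbf{q}}{\partial\theta}(J, \varphi, \theta)\,d\varphi\Big)\,d\theta\\ &= -\frac{2\pi}{\ell^2}\int_0^{2\pi}\Big(\int_0^{\ell}\mathbf{t}(\theta)\bullet\frac{\partial\mathbf{q}}{\partial\theta}(J, s, \theta)\,ds\Big)\,d\theta \qquad\text{using the substitution (2)}\\ &= -\frac{2\pi}{\ell^2}\int_0^{2\pi}\Big(\int_0^{\ell}|\mathbf{q}|\sin\alpha\,ds\Big)\,d\theta\\ &= -\frac{8\pi^2 A}{\ell^2} \end{aligned}$$

by equation (A) on page 657 once again.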


Foucault’s pendulum revisited. At first sight, the classical Foucault pendulum
would seem to be a perfect candidate for analysis in terms of the Hannay angle.
After all, here we have the angle $\varphi$ of the pendulum varying rapidly while the
whole system undergoes an extremely slow change due to the rotation of the
earth. On the other hand, it is not the total change in $\varphi$ that interests us, but
the change of the angle $\phi$ of the plane in which the pendulum swings. We
can analyse Foucault’s pendulum in terms of action-angle variables,¹ but the
Foucault pendulum phenomenon will really turn out to be just a cousin of the
Hannay angle phenomenon.

The angle g ducks out of the picture as soon as we use the equations (*) on 
page 389, where we now indicate the latitude in that equation by @ instead of 1, 


¹ From Khein and Nelson [1].


662 Chapter 22 


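and abbreviate the term $\omega\sin\ell$ by $\varpi$:

$$\text{(1)}\qquad \begin{aligned} x'' &= -\alpha^2 x + (2\varpi)y'\\ y'' &= -\alpha^2 y - (2\varpi)x'. \end{aligned}$$

We note that these are precisely Lagrange’s equations for the Lagrangian

$$L = \tfrac12(\dot x^2 + \dot y^2) - \tfrac12\alpha^2(x^2 + y^2) + \varpi(x\dot y - y\dot x).$$

Introducing polar coordinates $(\rho, \phi)$ in the $(x, y)$-plane,

$$x = \rho\cos\phi, \qquad y = \rho\sin\phi,$$

we express $L$ in polar coordinates by

$$L = \tfrac12(\dot\rho^2 + \rho^2\dot\phi^2) - \tfrac12\alpha^2\rho^2 + \varpi\rho^2\dot\phi.$$

The conjugate momenta are then

$$p_\rho = \frac{\partial L}{\partial\dot\rho} = \dot\rho, \qquad p_\phi = \frac{\partial L}{\partial\dot\phi} = \rho^2(\dot\phi + \varpi),$$

and the corresponding Hamiltonian $H$ is

$$\text{(2)}\qquad H = \frac{p_\rho^2}{2} + \frac{p_\phi^2}{2\rho^2} - \varpi p_\phi + \tfrac12(\alpha^2 + \varpi^2)\rho^2.$$

Since $\phi$ is cyclic in $L$, the quantity $p_\phi$ is a constant. Note that $H$ involves
no parameter, and doesn’t even depend on time, and our variables are now $\rho$
and $\phi$.

We now look for the action variables, just as in the case of a central force on
page 636,

$$J_\rho = \frac{1}{2\pi}\oint_\gamma p_\rho\,d\rho, \qquad J_\phi = \frac{1}{2\pi}\oint_\gamma p_\phi\,d\phi = p_\phi.$$

Setting $H = E$ in (2), we obtain

$$\rho^2 p_\rho^2 = -p_\phi^2 + 2(E + \varpi p_\phi)\rho^2 - (\alpha^2 + \varpi^2)\rho^4.$$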

Epilogue 663 


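In terms of

$$r = \rho^2, \qquad a = \frac{p_\phi^2}{\alpha^2 + \varpi^2}, \qquad b = \frac{E + \varpi p_\phi}{\alpha^2 + \varpi^2},$$

we can write

$$J_\rho = \frac{1}{4\pi}\sqrt{\alpha^2 + \varpi^2}\oint_\gamma\sqrt{-\frac{a}{r^2} + \frac{2b}{r} - 1}\;dr.$$

This can be evaluated by contour integration, as in Problem 21-4, although now
there are four real roots to worry about. The result is

$$J_\rho = \tfrac12\sqrt{\alpha^2 + \varpi^2}\,\big(b - \sqrt{a}\big) = \frac12\Big(\frac{E + \varpi p_\phi}{\sqrt{\alpha^2 + \varpi^2}} - |p_\phi|\Big),$$

where the $\sqrt{a}$, involving $\sqrt{p_\phi^2}$, gives rise to the $|p_\phi|$ term.¹ Solving this for $E$
gives

$$\text{(3)}\qquad H = E = (2J_\rho + |J_\phi|)\sqrt{\alpha^2 + \varpi^2} - \varpi J_\phi,$$

expressing $H$ in terms of the action-angle variables.
For solutions with $J_\phi$ negative we have $\partial|J_\phi|/\partial J_\phi = -1$, so Hamilton’s equation
for $\phi$ is

$$\dot\phi = \frac{\partial H}{\partial J_\phi} = -\sqrt{\alpha^2 + \varpi^2} - \varpi = -\omega_1, \text{ say},$$

so that for some constant $C$ we have

$$\text{(4)}\qquad \phi(t) = -\omega_1 t + C.$$

For solutions with $J_\phi$ positive, we have $\partial|J_\phi|/\partial J_\phi = 1$ and we get

$$\dot\phi = \frac{\partial H}{\partial J_\phi} = \sqrt{\alpha^2 + \varpi^2} - \varpi = \omega_2, \text{ say},$$

so that

$$\text{(5)}\qquad \phi(t) = \omega_2 t + C.$$

¹ As Khein and Nelson [1], [2] point out, this absolute value sign will be essential in
this case, but is almost always neglected (as in the calculations on page 636), although
in most cases that doesn’t lead to any difficulties.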


664 Chapter 22 


Remembering that $\varpi = \omega\sin\ell$ for the angular velocity $\omega$ of the earth, we
find that in both cases the total change in $\phi$ in one day’s time is

$$\phi\Big(\frac{2\pi}{\omega}\Big) - \phi(0) = \pm\frac{2\pi}{\omega}\sqrt{\alpha^2 + \varpi^2} - \frac{2\pi}{\omega}\,\omega\sin\ell = \pm\frac{2\pi}{\omega}\sqrt{\alpha^2 + \varpi^2} - 2\pi\sin\ell.$$

The first term, depending on the sign of $J_\phi$, is the dynamical angle. The second
term is the “Hannay angle”,

$$\Delta\phi = -2\pi\sin\ell,$$

showing that the angular rate of change of $\phi$ is $-\omega\sin\ell$ in both cases.
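As a rough numerical companion to these formulas, the sketch below evaluates the geometric angle $-2\pi\sin\ell$ for a given latitude, the two mode rates $-\omega_1$ and $\omega_2$, and integrates equations (1) in the complex form $z'' = -\alpha^2 z - 2i\varpi z'$ with $z = x + iy$, for a bob released from rest in the rotating frame. The numerical values of $\alpha$ and $\varpi$, the RK4 integrator, and all function names are assumptions chosen for illustration. For these initial conditions the closed-form solution is $z(t) = \rho_0 e^{-i\varpi t}\big[\cos\Omega t + i(\varpi/\Omega)\sin\Omega t\big]$ with $\Omega = \sqrt{\alpha^2 + \varpi^2}$, so after a whole number of swings the bob is back at full extension, turned through the angle $-\varpi t$.

```python
import cmath
import math

def hannay_angle_per_day(lat_deg):
    """Geometric angle -2*pi*sin(latitude) for one rotation of the earth."""
    return -2.0 * math.pi * math.sin(math.radians(lat_deg))

def mode_rates(alpha, varpi):
    """Angular rates (-omega_1, omega_2) of the two circular normal modes."""
    big = math.sqrt(alpha ** 2 + varpi ** 2)
    return -(big + varpi), big - varpi

def integrate(alpha, varpi, z0, t_end, n=20000):
    """RK4 for z'' = -alpha^2 z - 2i*varpi*z', z = x + i*y, starting at rest."""
    def rhs(z, w):
        return w, -alpha ** 2 * z - 2j * varpi * w
    h = t_end / n
    z, w = complex(z0), 0j
    for _ in range(n):
        k1 = rhs(z, w)
        k2 = rhs(z + h / 2 * k1[0], w + h / 2 * k1[1])
        k3 = rhs(z + h / 2 * k2[0], w + h / 2 * k2[1])
        k4 = rhs(z + h * k3[0], w + h * k3[1])
        z += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        w += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return z

if __name__ == "__main__":
    alpha, varpi = 1.0, 0.05                    # illustrative values only
    big = math.sqrt(alpha ** 2 + varpi ** 2)
    t_end = 5 * 2 * math.pi / big               # five complete swings
    z = integrate(alpha, varpi, 1.0, t_end)
    print("swing plane turned by", cmath.phase(z), "expected", -varpi * t_end)
    print("geometric angle at latitude 30:", hannay_angle_per_day(30.0))
```

The two mode rates differ in magnitude by exactly $2\varpi$, which is why both kinds of solution pick up the same extra angle $-2\pi\sin\ell$ per day.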


To connect this perhaps somewhat confusing discussion with the Foucault
pendulum, note that Foucault’s pendulum is just one of the solutions for this
Hamiltonian, and for any solution the value of $J_\phi$ will depend on the initial
conditions, though we actually have to be a bit careful about what that means.

In the case of the Foucault pendulum, we release the bob from rest, taking
care not to impart any sidewise motion. But $J_\phi$ for the Foucault pendulum is
still positive (in the northern hemisphere), because the “rest” point from which
we start it already has its own positive $J_\phi$. To get $J_\phi = 0$, we have to give the
bob an initial angular velocity $\dot\phi = -\varpi$ (as reckoned in our [actually rotating]
coordinate system).

On the other hand, knowing the value of $J_\phi$ for the Foucault pendulum, or
even the sign of $J_\phi$, doesn’t really matter for our problem, since every value
of $J_\phi$ ends up giving the same Hannay angle, and even $J_\phi = 0$ gives the same
Hannay angle, though its dynamical angle is 0.

These considerations can be illuminated by the investigations of Onnes that
were mentioned previously on page 391. All solutions to (1) starting at rest with
initial condition $\rho(0) = \rho_0$ can be written in terms of two normal modes. In
both normal modes the pendulum bob traces out a circle of radius $\rho_0$, with
mode 1 going clockwise, having $J_\phi < 0$, and mode 2 going counter-clockwise,
having $J_\phi > 0$. Since every solution is a combination of the two normal modes,
with the same Hannay angle, every solution has that same Hannay angle. Many
more details, including the behavior of the $\rho$ variable, and specific data for
Foucault’s pendulum, can be found in Khein and Nelson [1].


Epilogue 665 


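PROBLEMS

1. For small oscillations we have $\cos\theta \approx 1 - \tfrac12\theta^2$ and the pendulum equation,
$\theta'' + (g/l)\theta = 0$, has the solution

$$\theta = A\cos\omega t, \qquad \omega^2 = g/l.$$

(a) The energy $E$, the sum of the kinetic and potential energy, is

$$E = \tfrac12 mglA^2$$

and the tension is

$$T = mg - \tfrac12 mgA^2\cos^2\omega t + mgA^2\sin^2\omega t.$$

(b) The average $\langle F\rangle = \langle T\rangle$ over a complete period is $\frac{\omega}{2\pi}\int_0^{2\pi/\omega} T\,dt$. Using
the fact that $\int_0^{2\pi}\cos^2 t\,dt = \int_0^{2\pi}\sin^2 t\,dt = \pi$, show that

$$\langle F\rangle = mg + \tfrac14 mgA^2.$$

(c) Conclude that

$$-\delta l\cdot\langle F\rangle = -mg\cdot\delta l - \tfrac14 mgA^2\cdot\delta l \implies \delta E = -\tfrac14 mgA^2\cdot\delta l \implies \frac{\delta E}{E} = -\frac12\,\frac{\delta l}{l}.$$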

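2. Given a function $r$, consider the curve $\theta \mapsto (r(\theta)\cos\theta, r(\theta)\sin\theta)$ (the graph
of $r$ in polar coordinates). Problem 9-5 shows that

$$\tan\alpha = \frac{r(\theta)}{r'(\theta)}.$$

[Figure: the curve, with the angle $\alpha$ between the radius vector and the tangent.]

(The $\alpha$ in Problem 9-5 is the supplement of the $\alpha$ in this figure, and tan is the
same for both.)
For the region $R$ bounded by the graph $C$ of $r$, the area $A$ is (as on page 630)

$$A = \int_R r\,dr\wedge d\theta = \int_R d\big(\tfrac12 r^2\,d\theta\big) = \int_C \tfrac12 r^2\,d\theta = \frac12\int_0^{2\pi} r(\theta)^2\,d\theta.$$

Using the substitution

$$s = \int_0^{\theta}\sqrt{r(\theta)^2 + r'(\theta)^2}\;d\theta, \qquad ds = \sqrt{r(\theta)^2 + r'(\theta)^2}\;d\theta,$$

show that for $C$ of length $\ell$ we have

$$A = \frac12\int_0^{\ell}\frac{r(s)}{\sqrt{1 + \big(r'(s)/r(s)\big)^2}}\,ds, \quad\text{leading to}\quad 2A = \int_0^{\ell} r(s)\sin\alpha(s)\,ds.$$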




SUPPLEMENT 
A PDE PRIMER 


The material of this Supplement is basically the beginning of the first chapter 
of DG, Vol. 5. 

When we consider an ordinary differential equation u'(x) = f(x,u(x)), we 
find that there are solutions u with any desired value for u(xo), this dependence 
on the “initial condition” u(x) usually manifesting itself, if we explicitly solve 
the equation, by the presence of an arbitrary constant of integration. Equations 
of order n, on the other hand, will involve n constants of integration. 

When we solve a PDE, we usually obtain arbitrary functions in the answer. 
For example, to be as simple-minded about the thing as we can, we note that 
the equation 


$$\frac{\partial u}{\partial y}(x, y) = 0$$

has the solutions $u(x, y) = A(x)$; the only restrictions on $A$ are ones which follow
from restrictions we might choose to place on $u$ (e.g., that $u$ be differentiable
with respect to $x$). The equally stupid looking, but actually quite important, second
order equation

$$\frac{\partial^2 u}{\partial x\,\partial y}(x, y) = 0$$

leads to

$$\frac{\partial u}{\partial x}(x, y) = a(x),$$

and hence to

$$u(x, y) = A(x) + B(y), \qquad A'(x) = a(x).$$


Without belaboring the point any further, we simply note that when we look 
for precise theorems, we should expect the hypotheses to reflect the presence 
of these “arbitrary functions” in the same way that the precise theorem for 
ordinary differential equations reflects the presence of arbitrary constants. 

We will first consider equations that involve a function $u$ on $\mathbb{R}^n$ and only its first partial derivatives $u_{x_i}$. For simplicity of writing, and convenience of visualization, we will first deal exclusively with the case of $\mathbb{R}^2$, denoting a typical point of $\mathbb{R}^2$ by $(x,y)$ and adopting the standard notation
\[ u_x = p, \qquad u_y = q. \]




By a first order PDE we then mean an equation of the form
\[ F(x, y, u(x,y), u_x(x,y), u_y(x,y)) = 0, \]
or, to use the standard abbreviated form,
\[ F(x, y, u, p, q) = 0. \]

It will be convenient to denote the various partial derivatives of $F$ by $F_x$, $F_y$, $F_u$, $F_p$, and $F_q$. Naturally, the function $F\colon \mathbb{R}^5 \to \mathbb{R}$ shouldn't be too badly behaved; for example, it wouldn't be very interesting if $F$ were never 0. Just what hypotheses we really need will come out soon enough. To begin with, we might imagine that $F$ is differentiable and satisfies $F_p \ne 0$ or $F_q \ne 0$, so that by the implicit function theorem we can solve for $p$ in terms of $q$, or vice versa. Our main result is that we can always completely reduce any first order PDE to a system of ordinary differential equations. This holds both in a “practical” and in a theoretical sense: We can actually write down a system of ordinary differential equations whose solutions, if we can find them, will give us the solution of our original problem; and the method by which this is done enables us to state and prove exact theorems. We will not deal at the very outset with the most general first order PDE, but will approach it in stages.
We consider first the most general linear first order PDE
\[ A(x,y)u_x(x,y) + B(x,y)u_y(x,y) = C(x,y)u(x,y) + D(x,y). \tag{1} \]
Usually this is simply written
\[ A(x,y)u_x + B(x,y)u_y = C(x,y)u + D(x,y), \]
with the arguments $(x,y)$ appearing in $A$, $B$, $C$, and $D$ just to emphasize that we are not considering an equation like $A(x,y,u(x,y))u_x + \cdots$.
Consider the vector field $X$ on $\mathbb{R}^2$ defined by
\[ X = A\frac{\partial}{\partial x} + B\frac{\partial}{\partial y}. \tag{2} \]
The value of $X$ at $(x_0,y_0)$ is
\[ A(x_0,y_0)\left.\frac{\partial}{\partial x}\right|_{(x_0,y_0)} + B(x_0,y_0)\left.\frac{\partial}{\partial y}\right|_{(x_0,y_0)}; \]
using the standard identification of the tangent space $\mathbb{R}^2_{(x_0,y_0)}$ with $\mathbb{R}^2$, we can also write
\[ X(x_0,y_0) = (A(x_0,y_0),\, B(x_0,y_0)). \]




We will call $X$ the characteristic vector field of equation (1); the integral curves of this vector field are called the characteristic curves of equation (1). Thus $c = (c_1, c_2)$ is a characteristic curve if and only if
\[ \frac{dc_1(t)}{dt} = A(c(t)), \qquad \frac{dc_2(t)}{dt} = B(c(t)). \tag{3} \]





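We then have, for any $C^1$ function $u\colon \mathbb{R}^2 \to \mathbb{R}$,
\[ \frac{du(c(t))}{dt} = u_x(c(t))\frac{dc_1(t)}{dt} + u_y(c(t))\frac{dc_2(t)}{dt} = A(c(t))\cdot u_x(c(t)) + B(c(t))\cdot u_y(c(t)). \]
So any solution $u$ of equation (1) satisfies
\[ \frac{du(c(t))}{dt} = C(c(t))\cdot u(c(t)) + D(c(t)) \quad\text{for any characteristic curve } c. \tag{4} \]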


For any fixed characteristic curve $t \mapsto c(t)$, equation (4) is an ordinary differential equation for the function $u\circ c$. Consequently, $u\circ c$ is uniquely determined once $u(c(t_0))$ is specified. In other words, once we prescribe a value $u(x_0,y_0)$ for a solution $u$ of equation (1), the solution $u$ will then be completely determined along the characteristic curve $c$ through $(x_0,y_0)$.

[Figure: the characteristic curve $c$ through $(x_0,y_0) = c(t_0)$.]

Now suppose we have any curve $\sigma$ which cuts a family of characteristic curves. If we arbitrarily specify the values of $u$ at each point of $\sigma$, then the solution $u$ will be determined in a neighborhood of $\sigma$. Moreover, we ought to be able to produce this solution $u$ simply by solving equation (4) for each of the characteristic curves through each point of $\sigma$. Of course, we clearly have to rule out the possibility that a portion of $\sigma$ itself is a characteristic curve, for then we could not arbitrarily specify the values of $u$ along $\sigma$. We even have to rule out the possibility that $\sigma$ is tangent to some integral curve $c$ at some point $(x_0,y_0) = c(t_0)$; for in this case, the directional derivative $X(x_0,y_0)(u)$ would be determined both by equation (4) and (in a possibly conflicting way) by the arbitrarily assigned values of $u$ along $\sigma$. We must thus assume that the vectors
\[ \sigma'(s) = (\sigma_1'(s), \sigma_2'(s)) \qquad\text{and}\qquad (A(\sigma(s)),\, B(\sigma(s))) \]
are always linearly independent. Equivalently, we must require that
\[ 0 \ne \det\begin{pmatrix} \sigma_1'(s) & A(\sigma(s)) \\ \sigma_2'(s) & B(\sigma(s)) \end{pmatrix} = \sigma_1'(s)B(\sigma(s)) - \sigma_2'(s)A(\sigma(s)) \]
for all $s$. In particular, $\sigma'(s) \ne (0,0)$ so $\sigma$ is an imbedding. Although we will later have a much more general result, we summarize this information in a theorem, in order to get all the details cleaned up before we carry the discussion any further.


1. THEOREM. Let $A$, $B$, $C$, and $D$ be $C^k$ functions defined in an open set $U \subset \mathbb{R}^2$, and let $\sigma\colon [a,b] \to U$ be a one-one $C^k$ curve such that
\[ \sigma_1'(s)B(\sigma(s)) \ne \sigma_2'(s)A(\sigma(s)) \quad\text{for all } s \in [a,b]. \]
Let $\mathring{u}\colon [a,b] \to \mathbb{R}$ be a $C^k$ function. Then there is a $C^k$ function $u$, defined in a neighborhood $V$ of $\sigma([a,b])$, such that $u$ satisfies
\[ A\cdot u_x + B\cdot u_y = C\cdot u + D \quad\text{on } V, \tag{1} \]
with the initial condition
\[ u(\sigma(s)) = \mathring{u}(s) \quad\text{for all } s \in [a,b]. \]
Moreover, any two functions $u$ with this property agree on a neighborhood of $\sigma([a,b])$.




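PROOF. There is a $C^k$ map
\[ \gamma\colon [a,b]\times(-\varepsilon,\varepsilon) \to U \]
such that each curve $t \mapsto \gamma(s,t)$ is a characteristic curve with
\[ \gamma(s,0) = \sigma(s). \]
Clearly
\[ \frac{\partial\gamma}{\partial s}(s,0) = \sigma'(s) = (\sigma_1'(s), \sigma_2'(s)), \qquad \frac{\partial\gamma}{\partial t}(s,0) = (A(\sigma(s)),\, B(\sigma(s))). \]
So, by the hypothesis on $\sigma$, the Jacobian of $\gamma$ at $(s,0)$ is always nonsingular; consequently, if $\varepsilon$ is sufficiently small, then $\gamma$ is a $C^k$ diffeomorphism onto a neighborhood $V$ of $\sigma([a,b])$.

By choosing $\varepsilon$ still smaller, if necessary, we can insure that for each $s \in [a,b]$ there is a $C^k$ function $\beta_s\colon (-\varepsilon,\varepsilon) \to \mathbb{R}$ satisfying
\[ \frac{d\beta_s(t)}{dt} = C(\gamma(s,t))\cdot\beta_s(t) + D(\gamma(s,t)), \qquad \beta_s(0) = \mathring{u}(s) \]
[this is just the equation (4) which should be satisfied by $u\circ c$ along the integral curve $t \mapsto \gamma(s,t)$]. We would actually like to know that $\beta_s(t)$ is $C^k$ as a function of $s$ and $t$; in other words, if we define $\beta\colon [a,b]\times(-\varepsilon,\varepsilon) \to \mathbb{R}$ by
\[ \beta(s,t) = \beta_s(t), \]
then we would like to know that $\beta$ is $C^k$. To prove this, we must use the fact (see, e.g., DG, Vol. 1, Prob. 5-5) that we can solve the equation “depending on parameters”
\[ \alpha(0,s,r) = r \quad\text{for } r \in \mathbb{R}, \qquad \frac{\partial}{\partial t}\alpha(t,s,r) = C(\gamma(s,t))\cdot\alpha(t,s,r) + D(\gamma(s,t)), \]
for a $C^k$ function $\alpha$, so that
\[ \beta(s,t) = \alpha(t,s,\mathring{u}(s)) \]
is also $C^k$.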




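Now the solution $u$, if it exists, clearly must be the $C^k$ function
\[ u(x,y) = \beta(\gamma^{-1}(x,y)) \qquad\text{or equivalently}\qquad u(\gamma(s,t)) = \beta(s,t). \]
To prove that $u$ really is a solution, we note that through any point $(x,y) \in V$ there is a characteristic curve $t \mapsto \gamma(s,t)$, and that
\[ \frac{du(\gamma(s,t))}{dt} = \frac{d\beta(s,t)}{dt} = C(\gamma(s,t))\cdot\beta(s,t) + D(\gamma(s,t)) = C(\gamma(s,t))\cdot u(\gamma(s,t)) + D(\gamma(s,t)), \]
while we also have
\[ \frac{du(\gamma(s,t))}{dt} = u_x(\gamma(s,t))\cdot\frac{\partial\gamma_1}{\partial t}(s,t) + u_y(\gamma(s,t))\cdot\frac{\partial\gamma_2}{\partial t}(s,t) = u_x(\gamma(s,t))\cdot A(\gamma(s,t)) + u_y(\gamma(s,t))\cdot B(\gamma(s,t)), \]
since $t \mapsto \gamma(s,t)$ is a characteristic curve. ❖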
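The proof is constructive, and the construction is easy to try numerically. The following Python sketch is an illustration only, not part of the text: the helper names `rk4_step` and `solve_on_characteristic`, and the sample equation $-y\,u_x + x\,u_y = u$ with initial curve $\sigma(s) = (s,0)$ and initial condition $s^2$, are assumptions chosen for the example. It integrates the characteristic equations (3) together with the ordinary differential equation (4) along one characteristic curve, and compares the result with the closed-form solution $u(x,y) = (x^2+y^2)\,e^{\arctan(y/x)}$, valid for $x > 0$.

```python
import math

def rk4_step(f, state, h):
    """One classical Runge-Kutta step for the system state' = f(state)."""
    k1 = f(state)
    k2 = f([x + 0.5 * h * k for x, k in zip(state, k1)])
    k3 = f([x + 0.5 * h * k for x, k in zip(state, k2)])
    k4 = f([x + h * k for x, k in zip(state, k3)])
    return [x + h * (a + 2 * b + 2 * c + d) / 6.0
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)]

def solve_on_characteristic(A, B, C, D, sigma, u0, s, t_end, steps=200):
    """Follow the characteristic through sigma(s); return (x, y, u) at t_end."""
    x0, y0 = sigma(s)
    state = [x0, y0, u0(s)]

    def rhs(st):
        x, y, v = st
        # Equations (3) for the curve, equation (4) for the value of u on it.
        return [A(x, y), B(x, y), C(x, y) * v + D(x, y)]

    h = t_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, h)
    return tuple(state)

# Hypothetical sample problem: -y*u_x + x*u_y = u, with u(s, 0) = s**2.
# The x-axis is free for s > 0, since sigma_1' * B - sigma_2' * A = s != 0.
A = lambda x, y: -y
B = lambda x, y: x
C = lambda x, y: 1.0
D = lambda x, y: 0.0
sigma = lambda s: (s, 0.0)
u0 = lambda s: s * s

x, y, u_num = solve_on_characteristic(A, B, C, D, sigma, u0, s=1.5, t_end=0.4)
u_exact = (x * x + y * y) * math.exp(math.atan2(y, x))
print(abs(u_num - u_exact))
```

Here the characteristic curves are circles about the origin, so the point reached from $\sigma(1.5)$ after time $0.4$ is $(1.5\cos 0.4,\ 1.5\sin 0.4)$; repeating the call for many values of $s$ and $t$ fills out the solution on a neighborhood of the initial curve.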


Notice that Theorem 1 involves exactly the sort of “arbitrary function” that our general considerations would lead us to expect: in a neighborhood of the “initial curve” $\sigma$ there is a solution $u$ satisfying the arbitrary “initial condition” $u(\sigma(s)) = \mathring{u}(s)$. The only requirement is that $\sigma$ be nowhere tangent to a characteristic curve; we will express this by saying that $\sigma$ is free (sometimes the term “non-characteristic” is used, but this seems a little misleading). In general, the problem of finding a solution of a PDE with an appropriate initial condition is called the “Cauchy problem” for this equation. Thus we have solved the Cauchy problem for the linear PDE (1) for any initial condition along any free curve. In particular, we can solve the Cauchy problem along the $x$-axis $\sigma(s) = (s,0)$ if the $x$-axis is free, which is equivalent to the condition that $B \ne 0$ along the $x$-axis. In this case we can use the given equation (1) to solve for $u_y$ in terms of $u_x$ along the $x$-axis:
\[ u_y = \frac{C}{B}\,u - \frac{A}{B}\,u_x + \frac{D}{B}. \]
If we were interested in the Cauchy problem only along the $x$-axis, then we could simply demand this very natural condition in our hypotheses, and not mention the characteristic curves at all; but the characteristic curves are still the most important ingredient in the proof, and their generalizations will play decisive roles in all other equations we discuss.

If our initial curve $\sigma$ actually happens to be a characteristic curve (thus failing in the worst possible way to be free), then we will be unable to solve the Cauchy problem, and this inability will be manifested in the worst possible way: the possible initial condition along $\sigma$ is almost uniquely determined; it is determined by the value at only one point, by the equation (4). On the other hand, if we are given an initial condition $\mathring{u}$ along $\sigma$ which does satisfy (4), then there will be infinitely many solutions $u$ with this initial condition; for we can consider any free curve $\rho$ with $\rho(0) = \sigma(0)$, and choose any initial data $\phi$ along $\rho$ with $\phi(0) = \mathring{u}(0)$. Thus, the characteristic curves are the places where different solutions agree.

[Figure: a free curve $\rho$ crossing the characteristic curve $\sigma$ at $\rho(0) = \sigma(0)$.]

From Theorem 1 we can see immediately that an arbitrary linear first order PDE has, in common with the simple-minded equation $\partial u/\partial y = 0$, a property which sharply distinguishes it from an ordinary differential equation
\[ u'(x) = f(x,u(x)). \]
For the ordinary differential equation, any solution $u$ will clearly be at least one time more differentiable than $f$ is, and if $f$ is analytic, the solution will also be analytic (cf. DG, Vol. 1, Prob. 6-9). But there are solutions of the equation in Theorem 1 which are only $C^l$ ($1 \le l < \infty$) even when $A$, $B$, $C$, $D$ are $C^k$ ($l < k \le \omega$). For we may choose $\sigma$ to be a $C^k$ curve and $\mathring{u}$ to be a function which is $C^l$, but not $C^{l+1}$; then the solution $u$ cannot be $C^{l+1}$, since its restriction to the $C^k$ curve $\sigma$ is not $C^{l+1}$.
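A concrete instance may help here; it is an illustration chosen for this remark, not an example from the text. Take constant (hence analytic) coefficients, the $x$-axis as initial curve, and an initial condition that is $C^1$ but not $C^2$:

```latex
\[
u_x + u_y = 0, \qquad \sigma(s) = (s, 0), \qquad u(s,0) = s\,|s| .
\]
% The x-axis is free, since B = 1 is nonzero along it.
% The characteristic curves are the lines x - y = const, and u is constant on each, so
\[
u(x,y) = (x-y)\,|x-y|, \qquad u_x = 2\,|x-y|, \qquad u_y = -2\,|x-y| .
\]
```

This $u$ is a $C^1$ solution, but $u_x$ fails to be differentiable across the characteristic line $x = y$, so $u$ is not $C^2$ even though the coefficients of the equation are analytic.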
We next consider the most general quasi-linear first order PDE
\[ A(x,y,u(x,y))\,u_x(x,y) + B(x,y,u(x,y))\,u_y(x,y) = C(x,y,u(x,y)), \tag{1} \]
or, more briefly,
\[ A(x,y,u)u_x + B(x,y,u)u_y = C(x,y,u). \]
The functions $A$, $B$, and $C$ are now defined on $\mathbb{R}^3$, and we consider the vector field $X$ in $\mathbb{R}^3$ defined by
\[ X = A\frac{\partial}{\partial x} + B\frac{\partial}{\partial y} + C\frac{\partial}{\partial z}. \tag{2} \]




This vector field will be called the characteristic vector field of equation (1); the integral curves of $X$ are called the characteristic curves of equation (1). Thus $c = (c_1,c_2,c_3)$ is a characteristic curve if and only if
\[ \frac{dc_1(t)}{dt} = A(c(t)), \qquad \frac{dc_2(t)}{dt} = B(c(t)), \qquad \frac{dc_3(t)}{dt} = C(c(t)). \tag{3} \]
The slight discrepancy between this terminology and that adopted in the linear case is easily explained. Notice that if $A$ and $B$ depend only on $x$ and $y$, then all characteristic vectors $X(x_0,y_0,z_0)$ have the same projection on the $(x,y)$-plane, namely $(A(x_0,y_0), B(x_0,y_0))$. So the characteristic curves of a linear equation are really the projections on the $(x,y)$-plane of the characteristic curves in $\mathbb{R}^3$.

For the quasi-linear PDE (1), the characteristic curves in $\mathbb{R}^3$ have the following significance. Any $C^1$ function $u\colon \mathbb{R}^2 \to \mathbb{R}$ determines a surface $M_u = \{(x,y,u(x,y))\} \subset \mathbb{R}^3$, and the vector
\[ (u_x(x,y),\, u_y(x,y),\, -1) \]
is normal to $M_u$ at $(x,y,u(x,y))$. Equation (1) is therefore equivalent to saying that $X(x,y,u(x,y))$ lies in the tangent space of $M_u$ at $(x,y,u(x,y))$. So the characteristic vectors at the various points of $M_u$ give a vector field on $M_u$. Thus $M_u$ is the union of integral curves of this vector field; that is, $M_u$ is the union of characteristic curves. If we are given an arbitrary initial condition $\mathring{u}$ along an initial curve $\sigma$ in $\mathbb{R}^2$, then we ought to be able to construct a solution $u$ passing through the curve $s \mapsto (\sigma_1(s), \sigma_2(s), \mathring{u}(s))$ in $\mathbb{R}^3$ simply by taking the union of the characteristic curves through all the points of this curve. We will clearly have to require that the vectors $(\sigma_1'(s), \sigma_2'(s))$ and $(A(\sigma_1(s),\sigma_2(s),\mathring{u}(s)),\, B(\sigma_1(s),\sigma_2(s),\mathring{u}(s)))$ are linearly independent for all $s$.

[Figure: the surface $M_u$.]


2. THEOREM. Let $A$, $B$, and $C$ be $C^k$ functions defined in an open set $U \subset \mathbb{R}^3$. Let $\sigma\colon [a,b] \to \mathbb{R}^2$ be a one-one $C^k$ function, and $\mathring{u}\colon [a,b] \to \mathbb{R}$ a $C^k$ function such that $(\sigma_1(s), \sigma_2(s), \mathring{u}(s)) \in U$ for all $s \in [a,b]$. Suppose moreover that
\[ \sigma_1'(s)\cdot B(\sigma_1(s),\sigma_2(s),\mathring{u}(s)) \ne \sigma_2'(s)\cdot A(\sigma_1(s),\sigma_2(s),\mathring{u}(s)) \quad\text{for all } s \in [a,b]. \]
Then there is a $C^k$ function $u$, defined in a neighborhood $V$ of $\sigma([a,b])$, which satisfies the equation
\[ A(x,y,u)u_x + B(x,y,u)u_y = C(x,y,u) \quad\text{on } V, \tag{1} \]
with the initial condition
\[ u(\sigma(s)) = \mathring{u}(s) \quad\text{for all } s \in [a,b]. \]
Moreover, any two functions $u$ with this property agree on a neighborhood of $\sigma([a,b])$.


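PROOF. Now there is a $C^k$ function $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ with
\[ \alpha(0,s,r) = r \quad\text{for } r \in \mathbb{R}^3, \]
\[ \begin{aligned} \frac{\partial}{\partial t}\alpha_1(t,s,r) &= A(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_2(t,s,r) &= B(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_3(t,s,r) &= C(\alpha(t,s,r)). \end{aligned} \tag{$*$} \]
Let
\[ \beta(s,t) = \alpha(t,s,\sigma_1(s),\sigma_2(s),\mathring{u}(s)), \]
so that $\beta$ is also $C^k$. In particular,
\[ \beta(s,0) = (\sigma_1(s),\sigma_2(s),\mathring{u}(s)) = \bullet, \text{ for short} \]
[so for each $s$, the curve $t \mapsto \beta(s,t)$ is a characteristic curve through $\bullet$]. If we define
\[ \gamma(s,t) = (\beta_1(s,t), \beta_2(s,t)) \in \mathbb{R}^2, \]
then the Jacobian of $\gamma$ at $(s,0)$ is
\[ \begin{pmatrix} \dfrac{\partial\beta_1}{\partial s}(s,0) & \dfrac{\partial\beta_1}{\partial t}(s,0) \\[2ex] \dfrac{\partial\beta_2}{\partial s}(s,0) & \dfrac{\partial\beta_2}{\partial t}(s,0) \end{pmatrix} = \begin{pmatrix} \sigma_1'(s) & \dfrac{\partial\alpha_1}{\partial t}(0,s,\bullet) \\[2ex] \sigma_2'(s) & \dfrac{\partial\alpha_2}{\partial t}(0,s,\bullet) \end{pmatrix} = \begin{pmatrix} \sigma_1'(s) & A(\bullet) \\ \sigma_2'(s) & B(\bullet) \end{pmatrix}, \]
assumed to be nonsingular. So if $\varepsilon$ is sufficiently small, $\gamma\colon [a,b]\times(-\varepsilon,\varepsilon) \to \mathbb{R}^2$ is a $C^k$ diffeomorphism onto a neighborhood $V$ of $\sigma([a,b])$.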
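The solution $u$, if it exists, clearly must be the $C^k$ function
\[ u(x,y) = \beta_3(\gamma^{-1}(x,y)) \qquad\text{or equivalently}\qquad u(\gamma(s,t)) = \beta_3(s,t). \]
To prove that $u$ is a solution, we note that for any point $(x,y) \in V$, there is a characteristic curve $t \mapsto \beta(s,t)$ through $(x,y,u(x,y))$, and that
\[ \frac{du(\gamma(s,t))}{dt} = \frac{d\beta_3(s,t)}{dt} = C(\beta(s,t)) \quad\text{by } (*), \]
while we also have
\[ \begin{aligned} \frac{du(\gamma(s,t))}{dt} &= u_x(\gamma(s,t))\cdot\frac{\partial\gamma_1}{\partial t}(s,t) + u_y(\gamma(s,t))\cdot\frac{\partial\gamma_2}{\partial t}(s,t) \\ &= u_x(\gamma(s,t))\cdot\frac{\partial\beta_1}{\partial t}(s,t) + u_y(\gamma(s,t))\cdot\frac{\partial\beta_2}{\partial t}(s,t) && \text{by definition of } \gamma \\ &= u_x(\gamma(s,t))\cdot A(\beta(s,t)) + u_y(\gamma(s,t))\cdot B(\beta(s,t)) && \text{by } (*). \end{aligned} \]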
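The same construction can be tried on a small quasi-linear example. The Python sketch below is an illustration under stated assumptions: the sample equation $u\,u_x + u_y = -u$ (so $A = z$, $B = 1$, $C = -z$), the initial curve $\sigma(s) = (s,0)$, the initial condition $\sin s$, and the helper names `integrate` and `characteristic_through` are all chosen for this example and do not come from the text. It integrates the characteristic equations (3) in $\mathbb{R}^3$ and compares the result with the closed form $x = s + \sin s\,(1 - e^{-t})$, $y = t$, $z = \sin s\; e^{-t}$.

```python
import math

def integrate(rhs, state, t_end, steps=400):
    """Classical RK4 integration of state' = rhs(state) from time 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs([x + 0.5 * h * k for x, k in zip(state, k1)])
        k3 = rhs([x + 0.5 * h * k for x, k in zip(state, k2)])
        k4 = rhs([x + h * k for x, k in zip(state, k3)])
        state = [x + h * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state

def characteristic_through(s, t_end):
    """Characteristic curve in R^3 starting at (s, 0, sin s), followed to t_end."""
    def rhs(st):
        x, y, z = st
        # (A, B, C) evaluated at (x, y, z) for the sample equation u*u_x + u_y = -u.
        return [z, 1.0, -z]
    return integrate(rhs, [s, 0.0, math.sin(s)], t_end)

# The x-axis is free for every initial condition here:
# sigma_1' * B - sigma_2' * A = 1 * 1 - 0 = 1.
s, t = 0.7, 0.5
x, y, z = characteristic_through(s, t)
x_exact = s + math.sin(s) * (1.0 - math.exp(-t))
z_exact = math.sin(s) * math.exp(-t)
print(abs(x - x_exact), abs(y - t), abs(z - z_exact))
```

Unlike the linear case, the projected curves $t \mapsto (x,y)$ now depend on the initial condition, which is why the freeness condition in Theorem 2 involves the initial values as well as the curve.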


We will say that the initial curve $\sigma$ is free for the initial condition $\mathring{u}$ when it satisfies
\[ \sigma_1'(s)\cdot B(\sigma_1(s),\sigma_2(s),\mathring{u}(s)) \ne \sigma_2'(s)\cdot A(\sigma_1(s),\sigma_2(s),\mathring{u}(s)). \]
Thus we can solve the Cauchy problem for a quasi-linear PDE (1) for any initial condition along any curve which is free for this initial condition. (In the linear case things are simpler, since the condition that $\sigma$ be free doesn't depend on the initial condition $\mathring{u}$.)

The worst way in which the initial curve $\sigma\colon [a,b] \to \mathbb{R}^2$ can fail to be free for the initial condition $\mathring{u}$ is when the vectors $\sigma'(s) = (\sigma_1'(s), \sigma_2'(s))$ and the vectors $(A(\sigma_1(s),\sigma_2(s),\mathring{u}(s)),\, B(\sigma_1(s),\sigma_2(s),\mathring{u}(s))) = (A(\bullet), B(\bullet))$ are everywhere linearly dependent. In this case, it is customary to say that $\sigma$ is characteristic for $\mathring{u}$; this does not mean that $\sigma$ is a characteristic curve (indeed, $\sigma$ isn't even a curve in $\mathbb{R}^3$). If we assume that $\sigma$ is an imbedding, then $\sigma$ is characteristic if and only if $(A(\bullet), B(\bullet))$ is always a multiple of the tangent vector $\sigma'(s)$; by reparameterizing $\sigma$ we can then arrange that
\[ (A(\bullet), B(\bullet)) = \sigma'(s). \]
Then if $\mathring{u}$ is to be the initial condition for a solution $u$ of (1) we must have
\[ C(\bullet) = \sigma_1'(s)\cdot u_x(\sigma(s)) + \sigma_2'(s)\cdot u_y(\sigma(s)) = \frac{d}{ds}u(\sigma(s)) = \frac{d\mathring{u}(s)}{ds}. \]
This shows that the reparameterized curve $s \mapsto (\sigma_1(s), \sigma_2(s), \mathring{u}(s))$ must be a characteristic curve; equivalently, the original curve $s \mapsto (\sigma_1(s), \sigma_2(s), \mathring{u}(s))$ must be a characteristic curve up to reparameterization in order for the Cauchy problem to be solvable when $\sigma$ is characteristic for $\mathring{u}$. If our initial condition $\mathring{u}$ does have this property, then there will be infinitely many solutions $u$ with this initial condition along $\sigma$. The characteristic curves in $\mathbb{R}^3$ are the places where the graphs of different solutions intersect; the projections of the characteristic curves onto $\mathbb{R}^2$ are the places where different solutions agree.

It should be clear once again that a quasi-linear first order PDE has solutions 
which are less differentiable than its coefficients. 

We are now ready to consider the most general first order PDE
\[ F(x,y,u,p,q) = F(x,y,u(x,y),u_x(x,y),u_y(x,y)) = 0. \tag{1} \]
This equation can also be reduced to a system of ordinary differential equations, but in this case the system will involve five functions; the geometric analysis will be correspondingly more complicated.

At each point $(x_0,y_0,z_0) \in \mathbb{R}^3$, we can consider the set of all vectors $(a,b,-1)$ with
\[ F(x_0,y_0,z_0,a,b) = 0, \]
and the corresponding family $\mathcal{F}(x_0,y_0,z_0)$ of planes perpendicular to such vectors. If $u$ is a solution of (1), and $M_u$ is the surface $M_u = \{(x,y,u(x,y))\}$, then the tangent space of $M_u$ at $(x_0,y_0,u(x_0,y_0))$ is a member of the family $\mathcal{F}(x_0,y_0,u(x_0,y_0))$. In order to describe this situation more geometrically, we would like to have a more geometric way of describing the families $\mathcal{F}(x_0,y_0,z_0)$.
Now the relation
\[ F(x_0,y_0,z_0,a,b) = 0 \]
is one equation in the two unknowns, $a$ and $b$, so $\mathcal{F}(x_0,y_0,z_0)$ ought to be a one-parameter family of planes; this suggests that there is a cone $K(x_0,y_0,z_0)$, having its vertex at $(x_0,y_0,z_0)$, such that a plane $P$ is in $\mathcal{F}(x_0,y_0,z_0)$ if and only if $P$ is tangent to $K(x_0,y_0,z_0)$ along a generator of this cone. If we consider a quasi-linear equation
\[ F(x,y,u,p,q) = A(x,y,u)\cdot p + B(x,y,u)\cdot q - C(x,y,u) = 0, \]
we immediately see that this is not always so. For in this case, the family $\mathcal{F}(x_0,y_0,z_0)$ consists of planes perpendicular to vectors $(a,b,-1)$ with
\[ a\cdot A(x_0,y_0,z_0) + b\cdot B(x_0,y_0,z_0) = C(x_0,y_0,z_0). \]
These planes all contain the characteristic vector
\[ (A(x_0,y_0,z_0),\, B(x_0,y_0,z_0),\, C(x_0,y_0,z_0)). \]
Thus our “cone” degenerates into a straight line through $(x_0,y_0,z_0)$, pointing in the direction of the characteristic vector at that point. Clearly things might be even messier if the analytic properties of the function $F$ are sufficiently nasty.

[Figure: the planes of the family all containing the characteristic vector.]




Despite these difficulties, we can obtain a great deal of geometric motivation by temporarily pretending that each family $\mathcal{F}(x_0,y_0,z_0)$ is determined by a cone $K(x_0,y_0,z_0)$, which happens to degenerate to a straight line in the case of a quasi-linear equation. This semi-mythical cone is called the Monge cone at $(x_0,y_0,z_0)$. Having accepted this fiction, we can now imagine a field of cones in $\mathbb{R}^3$; a $C^1$ function $u\colon \mathbb{R}^2 \to \mathbb{R}$ is a solution of equation (1) if and only if the corresponding surface $M_u = \{(x,y,u(x,y))\}$ is tangent to the Monge cone $K(x_0,y_0,u(x_0,y_0))$ at each point $(x_0,y_0,u(x_0,y_0))$. This gives us a field of directions at each point of $M_u$, namely the direction which lies along a generator of the Monge cone at that point. The integral manifolds of this field of directions could be called the “characteristic curves of the solution $u$”. This definition is easily seen to be compatible with the one already given in the quasi-linear case, where the Monge cones degenerate to straight lines: for these straight lines must be the field of directions for any solution $u$, and the “characteristic curves of the solution $u$” are simply those characteristic curves of the quasi-linear equation which happen to lie on $M_u$. But in the general case, we cannot write $\mathbb{R}^3$ as a disjoint union of curves in such a way that each $M_u$ is the union of a certain subset of these curves; we cannot describe the “characteristic curves of a solution $u$” at all until we already know $u$. This might make the concept seem rather useless, but the requisite supplementary considerations will appear quite naturally when we seek an analytic description of these geometric pictures.

[Figure: a field of Monge cones in $\mathbb{R}^3$.]

How would we go about finding an analytic description of the Monge cone? Addendum 8B [and especially the material in DG, Vol. 3, Chap. 3, Addendum mentioned at the end] suggests that the Monge cone $K(x_0,y_0,z_0)$ should be the envelope of the family of planes $\mathcal{F}(x_0,y_0,z_0)$; geometrically, the generators of $K(x_0,y_0,z_0)$ should be the limits of the intersections of two planes of the family $\mathcal{F}(x_0,y_0,z_0)$, the limit being formed as the two planes approach each other. Until we explicitly say the opposite, everything we now do will be based on the assumption that these limits really exist; the ensuing discussion is consequently merely a route to discovery, and does not purport to prove anything.

Let us assume for the moment that the equation
\[ F(x_0,y_0,z_0,a,b) = 0 \]
can be solved for $b$ in terms of $a$. In other words, assume there is a function $\phi$ with
\[ F(x_0,y_0,z_0,a,\phi(a)) = 0. \tag{i} \]
One plane of the family $\mathcal{F}(x_0,y_0,z_0)$ may be described by the equation
\[ z - z_0 = a(x - x_0) + \phi(a)(y - y_0). \]
A nearby plane may be described by the equation
\[ z - z_0 = (a+h)(x - x_0) + \phi(a+h)(y - y_0). \]
The points $(x,y,z)$ in the intersection then satisfy
\[ 0 = h(x - x_0) + [\phi(a+h) - \phi(a)](y - y_0), \]
and hence
\[ 0 = (x - x_0) + \frac{\phi(a+h) - \phi(a)}{h}\,(y - y_0). \]
Therefore points in the limiting intersection ought to satisfy
\[ \begin{aligned} z - z_0 &= a(x - x_0) + \phi(a)(y - y_0) \\ 0 &= (x - x_0) + \phi'(a)(y - y_0). \end{aligned} \tag{ii} \]
On the other hand, equation (i) shows that
\[ 0 = \frac{d}{da}F(x_0,y_0,z_0,a,\phi(a)) = F_p(x_0,y_0,z_0,a,\phi(a)) + \phi'(a)\cdot F_q(x_0,y_0,z_0,a,\phi(a)), \]
and hence
\[ \phi'(a) = -\frac{F_p(x_0,y_0,z_0,a,\phi(a))}{F_q(x_0,y_0,z_0,a,\phi(a))}. \tag{iii} \]
From (ii) and (iii) we find that the points $(x,y,z)$ on the Monge cone $K(x_0,y_0,z_0)$ should satisfy
\[ z - z_0 = a(x - x_0) + b(y - y_0), \qquad \frac{x - x_0}{F_p} = \frac{y - y_0}{F_q}, \tag{iv} \]
where $a$ and $b$ are numbers such that $F(x_0,y_0,z_0,a,b) = 0$ [$F_p$ and $F_q$ evaluated at $(x_0,y_0,z_0,a,b)$].
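As an illustration not taken from the text, consider the equation $p^2 + q^2 = 1$, for which the heuristic description of the Monge cone can be carried out explicitly and really does produce a cone. The parameter $\lambda$ below is introduced only for this computation.

```latex
\[
F(x,y,u,p,q) = p^2 + q^2 - 1, \qquad F_p = 2a, \quad F_q = 2b
\quad \text{at } (x_0,y_0,z_0,a,b), \qquad a^2 + b^2 = 1 .
\]
% The proportion in (iv), with common value \lambda:
\[
x - x_0 = 2a\lambda, \qquad y - y_0 = 2b\lambda
\;\Longrightarrow\;
z - z_0 = a(x-x_0) + b(y-y_0) = 2\lambda\,(a^2+b^2) = 2\lambda ,
\]
\[
(x-x_0)^2 + (y-y_0)^2 = 4\lambda^2\,(a^2+b^2) = (z-z_0)^2 .
\]
```

So in this case the generators sweep out the right circular cone with vertex $(x_0,y_0,z_0)$ whose generators make an angle of $45^\circ$ with the $z$-axis, and each plane of the family touches it along one generator.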
Fp Fg 




Now consider a solution $u$ of (1), and let
\[ z_0 = u(x_0,y_0), \qquad p_0 = u_x(x_0,y_0), \qquad q_0 = u_y(x_0,y_0). \]
The tangent plane of $M_u$ at $(x_0,y_0,z_0)$ consists of points $(x,y,z)$ satisfying
\[ z - z_0 = p_0(x - x_0) + q_0(y - y_0). \]
Equations (iv) show that points $(x,y,z)$ which are on both this tangent plane and the Monge cone $K(x_0,y_0,z_0)$ ought to satisfy
\[ \frac{x - x_0}{F_p} = \frac{y - y_0}{F_q} = \frac{z - z_0}{p_0F_p + q_0F_q} \qquad [F_p \text{ and } F_q \text{ evaluated at } (x_0,y_0,z_0,p_0,q_0)]. \tag{v} \]
Therefore, these points ought to lie along the line through $(x_0,y_0,z_0)$ with direction
\[ (F_p,\, F_q,\, p_0F_p + q_0F_q) \qquad [F_p \text{ and } F_q \text{ evaluated at } (x_0,y_0,z_0,p_0,q_0)]. \]


We have finally reached the stage where we can make a perfectly sensible definition, involving no assumptions at all. Let $u$ be a solution of (1), and for a point $(x_0,y_0)$, define $z_0$, $p_0$, and $q_0$ as before. We then define the characteristic vector of $u$ at $(x_0,y_0)$ to be the vector
\[ X(u; x_0,y_0) = (F_p,\, F_q,\, p_0F_p + q_0F_q), \tag{2} \]
where $F_p$ and $F_q$ are to be evaluated at $(x_0,y_0,z_0,p_0,q_0)$; this vector is to be considered as an element of $\mathbb{R}^3_{(x_0,y_0,z_0)}$. If $M_u = \{(x,y,u(x,y))\}$, then the tangent plane of $M_u$ at $(x_0,y_0,z_0)$ is perpendicular to the vector $(p_0,q_0,-1)$. The vector $X(u;x_0,y_0)$ clearly has this property, so every characteristic vector of $u$ is tangent to $M_u$, and the set of all characteristic vectors of $u$ forms a vector field on $M_u$. The integral curves of this vector field are called the characteristic curves of the solution $u$, and they are clearly curves on $M_u$.
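A characteristic curve $c$ of $u$ is thus a curve in $\mathbb{R}^3$ satisfying the equations
\[ \begin{aligned} \frac{dc_1(t)}{dt} &= F_p(\bullet) \\ \frac{dc_2(t)}{dt} &= F_q(\bullet) \\ \frac{dc_3(t)}{dt} &= u_x(c_1(t),c_2(t))\cdot F_p(\bullet) + u_y(c_1(t),c_2(t))\cdot F_q(\bullet) \end{aligned} \tag{3} \]
where $\bullet = (c_1(t),\, c_2(t),\, c_3(t),\, u_x(c_1(t),c_2(t)),\, u_y(c_1(t),c_2(t)))$.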




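Now if we assume that $u$ is $C^2$, then we can also obtain equations for the partials $u_x(c_1(t),c_2(t))$ and $u_y(c_1(t),c_2(t))$. For equations (3) allow us to write
\[ \begin{aligned} \frac{du_x(c_1(t),c_2(t))}{dt} &= u_{xx}(c_1(t),c_2(t))\frac{dc_1(t)}{dt} + u_{xy}(c_1(t),c_2(t))\frac{dc_2(t)}{dt} \\ &= u_{xx}(c_1(t),c_2(t))F_p(\bullet) + u_{xy}(c_1(t),c_2(t))F_q(\bullet) \\ \frac{du_y(c_1(t),c_2(t))}{dt} &= u_{yx}(c_1(t),c_2(t))F_p(\bullet) + u_{yy}(c_1(t),c_2(t))F_q(\bullet). \end{aligned} \tag{4} \]
On the other hand, since $u$ satisfies
\[ F(x,y,u(x,y),u_x(x,y),u_y(x,y)) = 0, \]
we also have
\[ \begin{aligned} F_x + u_xF_u + u_{xx}F_p + u_{yx}F_q &= 0 \\ F_y + u_yF_u + u_{xy}F_p + u_{yy}F_q &= 0 \end{aligned} \tag{5} \]
where all partials of $F$ are evaluated at $(x,y,u(x,y),u_x(x,y),u_y(x,y))$. Thus equations (4) become
\[ \begin{aligned} \frac{du_x(c_1(t),c_2(t))}{dt} &= -F_x(\bullet) - u_x(c_1(t),c_2(t))\cdot F_u(\bullet) \\ \frac{du_y(c_1(t),c_2(t))}{dt} &= -F_y(\bullet) - u_y(c_1(t),c_2(t))\cdot F_u(\bullet). \end{aligned} \tag{6} \]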
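Let us now define a curve $\Gamma$ in $\mathbb{R}^5$ by
\[ \Gamma(t) = (c_1(t),\, c_2(t),\, c_3(t),\, u_x(c_1(t),c_2(t)),\, u_y(c_1(t),c_2(t))). \tag{7} \]
Then equations (3) and (6) may be written
\[ \begin{aligned} \frac{d\Gamma_1(t)}{dt} &= F_p(\Gamma(t)) \\ \frac{d\Gamma_2(t)}{dt} &= F_q(\Gamma(t)) \\ \frac{d\Gamma_3(t)}{dt} &= \Gamma_4(t)\cdot F_p(\Gamma(t)) + \Gamma_5(t)\cdot F_q(\Gamma(t)) \\ \frac{d\Gamma_4(t)}{dt} &= -F_x(\Gamma(t)) - \Gamma_4(t)\cdot F_u(\Gamma(t)) \\ \frac{d\Gamma_5(t)}{dt} &= -F_y(\Gamma(t)) - \Gamma_5(t)\cdot F_u(\Gamma(t)). \end{aligned} \tag{8} \]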




Now although the curve $\Gamma$ was defined in terms of a solution $u$, the final equations (8) involve only the original equation (1). This will allow us to define geometrically meaningful objects which do not depend on knowing a solution $u$. We may regard a point $(x_0,y_0,z_0,a,b) \in \mathbb{R}^5$ as a plane in the tangent space $\mathbb{R}^3_{(x_0,y_0,z_0)}$, namely, as the plane perpendicular to the vector $(a,b,-1)$. A curve $\Gamma$ in $\mathbb{R}^5$ may then be regarded as a family of planes, the plane at time $t$ being in the tangent space of $\mathbb{R}^3$ at $c(t) = (\Gamma_1(t), \Gamma_2(t), \Gamma_3(t))$; it will be convenient to refer to this curve $c$ as the base curve of $\Gamma$. An arbitrary curve $\Gamma$ is called a strip if the tangent vector $c'(t)$ of the base curve $c$ always lies in the plane determined by $\Gamma$ at time $t$. This means that
\[ c'(t) = (\Gamma_1'(t), \Gamma_2'(t), \Gamma_3'(t)) \quad\text{is perpendicular to}\quad (\Gamma_4(t), \Gamma_5(t), -1). \]
So $\Gamma$ is a strip if and only if it satisfies the strip condition:
\[ \frac{d\Gamma_3(t)}{dt} = \Gamma_4(t)\frac{d\Gamma_1(t)}{dt} + \Gamma_5(t)\frac{d\Gamma_2(t)}{dt}. \tag{9} \]
Notice that any solution of (8) is automatically a strip. A curve $\Gamma$ will be called a characteristic strip of the PDE (1) if $\Gamma$ satisfies (8) and also
\[ F(\Gamma(t)) = F(\Gamma_1(t), \Gamma_2(t), \Gamma_3(t), \Gamma_4(t), \Gamma_5(t)) = 0. \tag{10} \]


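This last restriction is not as stringent as it might first seem, for if $\Gamma$ satisfies (8), then
\[ \begin{aligned} \frac{d}{dt}F(\Gamma(t)) &= F_x\frac{d\Gamma_1(t)}{dt} + F_y\frac{d\Gamma_2(t)}{dt} + F_u\frac{d\Gamma_3(t)}{dt} + F_p\frac{d\Gamma_4(t)}{dt} + F_q\frac{d\Gamma_5(t)}{dt} \\ &\qquad [\text{all partials of } F \text{ evaluated at } \Gamma(t)] \\ &= F_xF_p + F_yF_q + F_u\cdot(\Gamma_4(t)F_p + \Gamma_5(t)F_q) \\ &\qquad + F_p\cdot(-F_x - \Gamma_4(t)F_u) + F_q\cdot(-F_y - \Gamma_5(t)F_u) \\ &= 0. \end{aligned} \tag{11} \]
So if $\Gamma$ satisfies (8) and also satisfies (10) for one $t$, then it satisfies (10) for all $t$, and is consequently a characteristic strip.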
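The conservation law expressed by equation (11) is easy to observe numerically. In the Python sketch below, offered only as an illustration, the sample function $F(x,y,u,p,q) = p^2 + q^2 + xu$, the starting point, and the names `F`, `rhs`, `rk4_step` are assumptions made for the example. The code integrates the system (8) and records how far $F$ drifts from its initial value along the computed curve.

```python
def F(G):
    """Sample F(x, y, u, p, q) = p^2 + q^2 + x*u (an assumption for this demo)."""
    x, y, u, p, q = G
    return p * p + q * q + x * u

def rhs(G):
    """Right-hand side of the system (8) for the sample F above."""
    x, y, u, p, q = G
    Fx, Fy, Fu, Fp, Fq = u, 0.0, x, 2.0 * p, 2.0 * q
    return [Fp, Fq, p * Fp + q * Fq, -Fx - p * Fu, -Fy - q * Fu]

def rk4_step(G, h):
    """One classical Runge-Kutta step for G' = rhs(G)."""
    k1 = rhs(G)
    k2 = rhs([g + 0.5 * h * k for g, k in zip(G, k1)])
    k3 = rhs([g + 0.5 * h * k for g, k in zip(G, k2)])
    k4 = rhs([g + h * k for g, k in zip(G, k3)])
    return [g + h * (a + 2 * b + 2 * c + d) / 6.0
            for g, a, b, c, d in zip(G, k1, k2, k3, k4)]

G = [0.2, -0.1, 0.5, 0.3, 0.4]   # arbitrary starting point in R^5
F0 = F(G)                        # need not be zero
drift = 0.0
for _ in range(500):
    G = rk4_step(G, 0.001)
    drift = max(drift, abs(F(G) - F0))
print(F0, drift)
```

The starting value of $F$ here is not zero, which shows that (11) makes $F$ constant along every solution of (8), not only along characteristic strips; condition (10) then just selects the solutions on which that constant is 0.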




Now how are characteristic strips related to solutions? We have seen that if $u$ is a solution of (1), then $M_u$ is the union of certain characteristic curves [solutions of (3)]. Moreover, if $c$ is a characteristic curve, then the set of tangent planes of $M_u$ along $c$ gives the curve $\Gamma$ of equation (7), which is a characteristic strip. So $M_u$ is the union of base curves of characteristic strips.

Now suppose that we have an arbitrary curve $\Sigma$ in $\mathbb{R}^5$, with base curve $\sigma$, and that $F(\Sigma(s)) = 0$ for all $s$. There is a unique solution of (8) through each point $\Sigma(s)$, and by the remark after equation (11), this solution is a characteristic strip. We thus obtain a family of characteristic strips $\Gamma$. The union of the corresponding base curves $c$ is a surface $M_u$, containing the base curve $\sigma$. Is it reasonable to suppose now that $u$ is a solution of (1)? The answer is no, for there is clearly no hope unless $\Sigma$ is also a strip. When this condition is satisfied, then everything works out. We will prove that if $\sigma\colon [a,b] \to \mathbb{R}^2$ is a given curve, $\mathring{u}\colon [a,b] \to \mathbb{R}$ is a given function, and $\mathring{p}, \mathring{q}\colon [a,b] \to \mathbb{R}$ are two functions satisfying
\[ F(\Sigma(s)) = F(\sigma_1(s), \sigma_2(s), \mathring{u}(s), \mathring{p}(s), \mathring{q}(s)) = 0, \tag{a} \]
and the strip condition
\[ \frac{d\mathring{u}(s)}{ds} = \mathring{p}(s)\frac{d\sigma_1(s)}{ds} + \mathring{q}(s)\frac{d\sigma_2(s)}{ds}, \tag{b} \]
then there is a unique solution $u$ of (1) satisfying
\[ u(\sigma(s)) = \mathring{u}(s), \qquad u_x(\sigma(s)) = \mathring{p}(s), \qquad u_y(\sigma(s)) = \mathring{q}(s) \]
[naturally, (b) is a necessary consequence of these equations]. We will clearly have to assume that $\sigma'(s)$ is linearly independent of the vector obtained by projecting the characteristic vector (2) on the $(x,y)$-plane. In other words, we will have to require that $\sigma'(s)$ and $(F_p(\Sigma(s)), F_q(\Sigma(s)))$ are linearly independent, or that
\[ \sigma_1'(s)\cdot F_q(\Sigma(s)) \ne \sigma_2'(s)\cdot F_p(\Sigma(s)). \tag{c} \]




Before we proceed to prove the theorem, we should insert a remark about the hypotheses, which will involve $\sigma$, $\mathring{u}$, $\mathring{p}$, and $\mathring{q}$ satisfying (a)–(c). At first sight, we seem to be contradicting our basic philosophy about first order equations, for we seem to be saying that we can arbitrarily specify not only the values $\mathring{u}$ of $u$ along $\sigma$, but also the values $\mathring{p}$ and $\mathring{q}$ of $u_x$ and $u_y$ along $\sigma$. This is not really the case, for $\mathring{p}$ and $\mathring{q}$ are practically determined by the equations (a) and (b) which they must satisfy. This is most apparent when our initial curve $\sigma$ is the $x$-axis, $\sigma(s) = (s,0)$. Then equation (b) already determines $\mathring{p}$. Moreover, condition (c) says that $F_q \ne 0$ along $\{(s, 0, \mathring{u}(s), \mathring{p}(s), \mathring{q}(s))\}$, so the implicit function theorem shows that equation (a) can be solved for $\mathring{q}(s)$ in terms of $\mathring{p}(s)$; there is a function $\phi$ with
\[ F(s, 0, \mathring{u}(s), \mathring{p}(s), \phi(\mathring{p}(s))) = 0. \]
Of course, there may be several possible $\phi$, but once $\mathring{q}(0)$ is determined, there will be only one continuous choice of $\mathring{q}$ satisfying (a). [In the quasi-linear case, $\mathring{q}(s)$ will actually be uniquely determined.] It is not hard to see that a similar situation prevails when $\sigma$ is any curve satisfying (c): we are essentially specifying only the values $\mathring{u}$ of $u$ along $\sigma$, and then making certain that we have a continuous choice of the limited possibilities for $\mathring{p}$ and $\mathring{q}$. In order to emphasize this point we will refer to $(\mathring{u}, \mathring{p}, \mathring{q})$ as “initial data”, rather than as initial conditions.


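3. THEOREM. Let $F$ be a function of class $C^k$, $k \ge 3$, defined in an open set $U \subset \mathbb{R}^5$. Let $\sigma\colon [a,b] \to \mathbb{R}^2$ be a one-one $C^{k-1}$ function, and let $\mathring{u}, \mathring{p}, \mathring{q}\colon [a,b] \to \mathbb{R}$ be $C^{k-1}$ functions such that for all $s \in [a,b]$ we have
\[ \Sigma(s) = (\sigma_1(s), \sigma_2(s), \mathring{u}(s), \mathring{p}(s), \mathring{q}(s)) \in U \quad\text{and}\quad F(\Sigma(s)) = 0, \tag{a} \]
\[ \frac{d\mathring{u}(s)}{ds} = \mathring{p}(s)\frac{d\sigma_1(s)}{ds} + \mathring{q}(s)\frac{d\sigma_2(s)}{ds}, \tag{b} \]
\[ \sigma_1'(s)\cdot F_q(\Sigma(s)) \ne \sigma_2'(s)\cdot F_p(\Sigma(s)). \tag{c} \]
Then there is a $C^{k-1}$ function $u$, defined in a neighborhood $V$ of $\sigma([a,b])$, which satisfies the equation
\[ F(x,y,u(x,y),u_x(x,y),u_y(x,y)) = 0 \quad\text{on } V \]
and also
\[ u(\sigma(s)) = \mathring{u}(s), \qquad u_x(\sigma(s)) = \mathring{p}(s), \qquad u_y(\sigma(s)) = \mathring{q}(s), \qquad\text{for } s \in [a,b]. \]
Moreover, any two functions $u$ with this property agree on a neighborhood of $\sigma([a,b])$.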


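PROOF. As in the proof of Theorems 1 and 2, we use DG, Vol. 1, Prob. 5-5 to conclude that there is a $C^{k-1}$ function $\alpha = (\alpha_1, \dots, \alpha_5)$ with
\[ \alpha(0,s,r) = r \quad\text{for } r \in \mathbb{R}^5, \]
\[ \begin{aligned} \frac{\partial}{\partial t}\alpha_1(t,s,r) &= F_p(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_2(t,s,r) &= F_q(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_3(t,s,r) &= \alpha_4(t,s,r)\cdot F_p(\alpha(t,s,r)) + \alpha_5(t,s,r)\cdot F_q(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_4(t,s,r) &= -F_x(\alpha(t,s,r)) - \alpha_4(t,s,r)\cdot F_u(\alpha(t,s,r)) \\ \frac{\partial}{\partial t}\alpha_5(t,s,r) &= -F_y(\alpha(t,s,r)) - \alpha_5(t,s,r)\cdot F_u(\alpha(t,s,r)). \end{aligned} \tag{$*$} \]
Let
\[ \beta(s,t) = \alpha(t,s,\sigma_1(s),\sigma_2(s),\mathring{u}(s),\mathring{p}(s),\mathring{q}(s)), \]
so that $\beta$ is also $C^{k-1}$. In particular,
\[ \beta(s,0) = (\sigma_1(s),\sigma_2(s),\mathring{u}(s),\mathring{p}(s),\mathring{q}(s)) = \Sigma(s). \]
If we define
\[ \gamma(s,t) = (\beta_1(s,t), \beta_2(s,t)) \in \mathbb{R}^2, \]
then the Jacobian of $\gamma$ at $(s,0)$ is
\[ \begin{pmatrix} \dfrac{\partial\gamma_1}{\partial s}(s,0) & \dfrac{\partial\gamma_1}{\partial t}(s,0) \\[2ex] \dfrac{\partial\gamma_2}{\partial s}(s,0) & \dfrac{\partial\gamma_2}{\partial t}(s,0) \end{pmatrix} = \begin{pmatrix} \sigma_1'(s) & \dfrac{\partial\alpha_1}{\partial t}(0,s,\Sigma(s)) \\[2ex] \sigma_2'(s) & \dfrac{\partial\alpha_2}{\partial t}(0,s,\Sigma(s)) \end{pmatrix} = \begin{pmatrix} \sigma_1'(s) & F_p(\Sigma(s)) \\ \sigma_2'(s) & F_q(\Sigma(s)) \end{pmatrix}, \]
assumed nonsingular. So if $\varepsilon$ is sufficiently small, then $\gamma\colon [a,b]\times(-\varepsilon,\varepsilon) \to \mathbb{R}^2$ is a $C^{k-1}$ diffeomorphism onto a neighborhood $V$ of $\sigma([a,b])$.
The solution $u$, if it exists, clearly must be the $C^{k-1}$ function
\[ u(x,y) = \beta_3(\gamma^{-1}(x,y)) \qquad\text{or equivalently}\qquad u(\gamma(s,t)) = \beta_3(s,t). \]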




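We claim that
\[ u_x(\gamma(s,t)) = \beta_4(s,t) \qquad\text{and}\qquad u_y(\gamma(s,t)) = \beta_5(s,t). \]
This will prove that
\[ F(x,y,u(x,y),u_x(x,y),u_y(x,y)) = 0; \]
for we have already seen (equation 11) that $F(\alpha(t,s,r))$ is constant for fixed $s$ and $r$, while $F(\alpha(0,s,\Sigma(s))) = 0$ by (a), so that we will have
\[ \begin{aligned} 0 &= F(\alpha(t,s,\Sigma(s))) \\ &= F(\beta(s,t)) = F(\beta_1(s,t), \beta_2(s,t), \beta_3(s,t), \beta_4(s,t), \beta_5(s,t)) \\ &= F(\gamma(s,t),\, u(\gamma(s,t)),\, u_x(\gamma(s,t)),\, u_y(\gamma(s,t))). \end{aligned} \]
To prove the claim, we consider the function
\[ A = \frac{\partial\beta_3}{\partial s} - \beta_4\frac{\partial\beta_1}{\partial s} - \beta_5\frac{\partial\beta_2}{\partial s}. \]
We have
\[ A(s,0) = \frac{d\mathring{u}(s)}{ds} - \mathring{p}(s)\cdot\frac{d\sigma_1(s)}{ds} - \mathring{q}(s)\cdot\frac{d\sigma_2(s)}{ds} = 0 \quad\text{by (b)}. \]
Moreover,
\[ \begin{aligned} \frac{\partial A}{\partial t} &= \frac{\partial^2\beta_3}{\partial s\,\partial t} - \frac{\partial\beta_4}{\partial t}\frac{\partial\beta_1}{\partial s} - \frac{\partial\beta_5}{\partial t}\frac{\partial\beta_2}{\partial s} - \beta_4\frac{\partial^2\beta_1}{\partial s\,\partial t} - \beta_5\frac{\partial^2\beta_2}{\partial s\,\partial t} \\ &= \frac{\partial}{\partial s}\left(\frac{\partial\beta_3}{\partial t} - \beta_4\frac{\partial\beta_1}{\partial t} - \beta_5\frac{\partial\beta_2}{\partial t}\right) + \frac{\partial\beta_4}{\partial s}\frac{\partial\beta_1}{\partial t} + \frac{\partial\beta_5}{\partial s}\frac{\partial\beta_2}{\partial t} - \frac{\partial\beta_4}{\partial t}\frac{\partial\beta_1}{\partial s} - \frac{\partial\beta_5}{\partial t}\frac{\partial\beta_2}{\partial s} \\ &= 0 + F_p\cdot\frac{\partial\beta_4}{\partial s} + F_q\cdot\frac{\partial\beta_5}{\partial s} + (F_x + F_u\beta_4)\frac{\partial\beta_1}{\partial s} + (F_y + F_u\beta_5)\frac{\partial\beta_2}{\partial s} \\ &\qquad \text{by } (*) \text{ [where all partials of } F \text{ are evaluated at } \beta(s,t)] \\ &= F_x\frac{\partial\beta_1}{\partial s} + F_y\frac{\partial\beta_2}{\partial s} + F_u\frac{\partial\beta_3}{\partial s} + F_p\frac{\partial\beta_4}{\partial s} + F_q\frac{\partial\beta_5}{\partial s} - F_u\cdot\left(\frac{\partial\beta_3}{\partial s} - \beta_4\frac{\partial\beta_1}{\partial s} - \beta_5\frac{\partial\beta_2}{\partial s}\right) \\ &= \frac{\partial}{\partial s}\bigl(F(\beta(s,t))\bigr) - F_u\cdot A \\ &= -F_u\cdot A, \end{aligned} \]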



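since we have already seen that $F(\beta(s, t)) = 0$. Now for each fixed $s$, we have
an ordinary differential equation
\[ \frac{\partial A}{\partial t} = -F_u \cdot A, \]
with the initial condition
\[ A(s, 0) = 0, \]
so the unique solution is $A(s, t) = 0$. In other words, we have shown that
\[ \frac{\partial \beta_3}{\partial s} = \beta_4 \frac{\partial \beta_1}{\partial s} + \beta_5 \frac{\partial \beta_2}{\partial s}. \]
Also
\[ \frac{\partial \beta_3}{\partial t} = \beta_4 \frac{\partial \beta_1}{\partial t} + \beta_5 \frac{\partial \beta_2}{\partial t}. \]
On the other hand, differentiating the definition $u(\gamma(s, t)) = \beta_3(s, t)$ gives
\[
\begin{aligned}
\frac{\partial \beta_3}{\partial s} &= u_x(\gamma(s, t)) \frac{\partial \beta_1}{\partial s} + u_y(\gamma(s, t)) \frac{\partial \beta_2}{\partial s} \\
\frac{\partial \beta_3}{\partial t} &= u_x(\gamma(s, t)) \frac{\partial \beta_1}{\partial t} + u_y(\gamma(s, t)) \frac{\partial \beta_2}{\partial t}.
\end{aligned}
\]
These last four equations give two solutions for two linear equations in two
unknowns, whose determinant
\[
\det \begin{pmatrix}
\dfrac{\partial \beta_1}{\partial s} & \dfrac{\partial \beta_2}{\partial s} \\[2ex]
\dfrac{\partial \beta_1}{\partial t} & \dfrac{\partial \beta_2}{\partial t}
\end{pmatrix}
\]
is $\neq 0$ for $(s, t) \in [a, b] \times (-\varepsilon, \varepsilon)$. So the two solutions must be the same, i.e.,
\[ u_x(\gamma(s, t)) = \beta_4(s, t) \quad\text{and}\quad u_y(\gamma(s, t)) = \beta_5(s, t), \]
as desired. ∎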
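
The uniqueness step for $A$ can be made explicit. For fixed $s$ the equation for $A$ is
linear and homogeneous, with a coefficient that is already known once $\beta$ is known.
Writing $c(t) = F_u(\beta(s, t))$ (a name introduced only for this remark), every solution
has the form
\[ A(s, t) = A(s, 0) \, \exp\left( -\int_0^t c(\tau) \, d\tau \right), \]
so the initial condition $A(s, 0) = 0$ forces $A(s, t) = 0$ for all $t$.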


We will say that the initial curve $\sigma$ is free for the initial data $\bar u, \bar p, \bar q$ when
condition (c) in Theorem 3 is satisfied. Thus we can solve the Cauchy problem
for a first order PDE (1) for any initial strip $\Sigma = (\sigma_1, \sigma_2, \bar u, \bar p, \bar q)$ for which the
initial curve $\sigma$ is free for the initial data $\bar u, \bar p, \bar q$.
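
As a numerical sanity check on the construction in the proof of Theorem 3, the following minimal Python sketch integrates the characteristic strip equations for one concrete equation. Everything specific in it is an assumption chosen for illustration and does not come from the text: the eikonal equation $F = p^2 + q^2 - 1$, the initial strip over an arc of the unit circle with $\bar u = 0$ and $(\bar p, \bar q)$ the outward normal, the classical Runge-Kutta stepper, and the names `grad_F`, `strip_rhs`, `rk4_step`, `initial_strip` and `beta`. For this data conditions (a), (b) and (c) hold (the determinant in (c) is $-2$), and the exact solution is $u(x, y) = \sqrt{x^2 + y^2} - 1$.

```python
import math

def F(x, y, u, p, q):
    """Illustrative choice (not from the text): the eikonal equation p^2 + q^2 - 1 = 0."""
    return p * p + q * q - 1.0

def grad_F(x, y, u, p, q):
    """Partials (F_x, F_y, F_u, F_p, F_q) of the F above, computed by hand."""
    return (0.0, 0.0, 0.0, 2.0 * p, 2.0 * q)

def strip_rhs(state):
    """Right-hand side of the characteristic strip equations for (x, y, u, p, q)."""
    x, y, u, p, q = state
    Fx, Fy, Fu, Fp, Fq = grad_F(x, y, u, p, q)
    return (Fp, Fq, p * Fp + q * Fq, -Fx - p * Fu, -Fy - q * Fu)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h (a generic ODE stepper)."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = strip_rhs(state)
    k2 = strip_rhs(shifted(k1, h / 2))
    k3 = strip_rhs(shifted(k2, h / 2))
    k4 = strip_rhs(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def initial_strip(s):
    """Assumed initial strip Sigma(s): unit circle, u = 0, (p, q) the outward normal."""
    return (math.cos(s), math.sin(s), 0.0, math.cos(s), math.sin(s))

def beta(s, t, steps=100):
    """Approximate beta(s, t): follow the strip through Sigma(s) for time t."""
    state = initial_strip(s)
    h = t / steps
    for _ in range(steps):
        state = rk4_step(state, h)
    return state

if __name__ == "__main__":
    x, y, u, p, q = beta(0.3, 0.2)
    r = math.hypot(x, y)
    print("F along the strip:", F(x, y, u, p, q))
    print("u - (r - 1):", u - (r - 1.0))
    print("p - x/r, q - y/r:", p - x / r, q - y / r)
```

Under these assumptions the three printed quantities should all vanish up to rounding error: $F$ stays zero along the strip, the third component reproduces the exact solution, and the last two components reproduce its partial derivatives, which is the content of the claim proved above.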




Again we consider the case where our initial curve $\sigma$ fails to be free for
the initial data $\bar u, \bar p, \bar q$ in the worst possible way, namely when $\sigma'(s)$ and
$(F_p(\Sigma(s)), F_q(\Sigma(s)))$ are everywhere linearly dependent. Once again we say
that $\sigma$ is characteristic for $\bar u, \bar p, \bar q$. Assuming that $\sigma$ is an imbedding, we can
reparameterize $\sigma$ so that $\sigma'(s) = (F_p(\Sigma(s)), F_q(\Sigma(s)))$. This gives us the first
two equations in (8) for the curve $(\sigma_1, \sigma_2, \bar u, \bar p, \bar q)$. The third equation of (8) is
just the strip condition (b). The argument on page 682 shows that these three
equations imply the last two if there is a solution $u$ of (1) with
\[ u(\sigma(s)) = \bar u(s), \quad u_x(\sigma(s)) = \bar p(s), \quad u_y(\sigma(s)) = \bar q(s). \]
So when $\sigma$ is characteristic, the Cauchy problem is solvable for the initial data
$\bar u, \bar p, \bar q$ along $\sigma$ only if $(\sigma_1, \sigma_2, \bar u, \bar p, \bar q)$ is a characteristic strip. When this is the
case, there will be infinitely many solutions with this initial data along $\sigma$. The
base curves of characteristic strips are the intersection curves of the graphs of
different solutions meeting tangentially.

We can now describe the situation for first order PDE’s in $n$ variables very
easily, without bothering to write down all the results as formal theorems. Consider
first the quasi-linear PDE
\[ \sum_{i=1}^n A_i(x_1, \dots, x_n, u) \, u_{x_i} = A(x_1, \dots, x_n, u). \]
The characteristic vector field of this equation is the vector field $X$ in $\mathbb{R}^{n+1}$
defined by
\[ X = \sum_{i=1}^n A_i \frac{\partial}{\partial x_i} + A \frac{\partial}{\partial u}; \]
the integral curves of $X$ are the characteristic curves of the equation. As in the
case $n = 2$, it is clear that if $u$ is a solution of (1), then the hypersurface
\[ M_u = \{(x_1, \dots, x_n, u(x_1, \dots, x_n))\} \subset \mathbb{R}^{n+1} \]
is a union of characteristic curves. Now suppose we are given a one-one map
\[ \sigma\colon D \to \mathbb{R}^n, \]
where $D \subset \mathbb{R}^{n-1}$ is a compact $(n - 1)$-dimensional manifold-with-boundary,
and a function $\bar u\colon D \to \mathbb{R}$. We can produce a solution $u$ of (1) with
\[ u(\sigma(s)) = \bar u(s) \quad\text{for all } s \in D \]




by taking the union of the characteristic curves through all points $(\sigma(s), \bar u(s)) \in
\mathbb{R}^{n+1}$. The proof is exactly analogous to the proof of Theorem 2, except that
we will now require that the matrix
\[
\begin{pmatrix}
D_1\sigma_1(s) & \cdots & D_{n-1}\sigma_1(s) & A_1(\sigma(s), \bar u(s)) \\
\vdots & & \vdots & \vdots \\
D_1\sigma_n(s) & \cdots & D_{n-1}\sigma_n(s) & A_n(\sigma(s), \bar u(s))
\end{pmatrix}
\]
be nonsingular for all $s \in D$. This means, first of all, that the matrix $(D_j\sigma_i(s))$
must have rank $n - 1$, so that $\sigma$ is an imbedding and $\sigma(D) \subset \mathbb{R}^n$ is a hypersurface.
In addition, the vector $(A_1(\sigma(s), \bar u(s)), \dots, A_n(\sigma(s), \bar u(s)))$ must not lie in
the tangent space of $\sigma(D)$; we express this by saying that the “initial manifold”
$\sigma(D)$ is free for the initial condition $\bar u$ (for linear equations the initial condition $\bar u$
is irrelevant). Thus we can solve the Cauchy problem for any initial condition
along an initial $(n - 1)$-manifold which is free for the initial condition.
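
A small worked example, chosen here for illustration and not taken from the text, may help.
Take $n = 2$, $A_1 = A_2 = 1$ and $A = u$, so that the equation is $u_{x_1} + u_{x_2} = u$, with
initial manifold $\sigma(s) = (s, 0)$ for $s$ in a closed interval $D$ and an arbitrary differentiable
initial condition $\bar u$. The matrix required to be nonsingular is
\[ \begin{pmatrix} \sigma_1'(s) & A_1 \\ \sigma_2'(s) & A_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \]
so $\sigma(D)$ is free. The characteristic curve through $(s, 0, \bar u(s))$ solves
$x_1' = 1$, $x_2' = 1$, $u' = u$, which gives
\[ t \mapsto (s + t, \; t, \; \bar u(s) e^{t}), \]
and the union of these curves is the graph of
\[ u(x_1, x_2) = \bar u(x_1 - x_2) \, e^{x_2}. \]
Indeed $u_{x_1} + u_{x_2} = \bar u' e^{x_2} + (-\bar u' + \bar u) e^{x_2} = u$, where $\bar u$ and $\bar u'$ are
evaluated at $x_1 - x_2$, and $u(s, 0) = \bar u(s)$.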
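Now we consider the general first order PDE
\[ F(x_1, \dots, x_n, u(x_1, \dots, x_n), u_{x_1}(x_1, \dots, x_n), \dots, u_{x_n}(x_1, \dots, x_n)) = 0. \]
We denote the partials of $F$ by
\[ F_{x_i}, \quad F_u, \quad F_{p_i}. \]
Consider curves $\Gamma$ in $\mathbb{R}^{2n+1}$ satisfying
\[
\begin{aligned}
\frac{d\Gamma_i}{dt} &= F_{p_i}(\Gamma(t)) \qquad i = 1, \dots, n \\
\frac{d\Gamma_{n+1}}{dt} &= \sum_{i=1}^n \Gamma_{n+1+i}(t) \, F_{p_i}(\Gamma(t)) \\
\frac{d\Gamma_{n+1+i}}{dt} &= -F_{x_i}(\Gamma(t)) - \Gamma_{n+1+i}(t) \, F_u(\Gamma(t)) \qquad i = 1, \dots, n.
\end{aligned}
\]
As before, we easily check that if $\Gamma$ satisfies these equations, then $F(\Gamma(t))$ is
constant in $t$. A solution $\Gamma$ with $F(\Gamma(t)) = 0$ for all $t$ is called a characteristic
strip. Now suppose we have a one-one map
\[ \sigma\colon D \to \mathbb{R}^n \]
with $D \subset \mathbb{R}^{n-1}$ as before, and functions
\[ \bar u, \bar p_1, \dots, \bar p_n \colon D \to \mathbb{R} \]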




with
\[ F(\Sigma(s)) = F(\sigma_1(s), \dots, \sigma_n(s), \bar u(s), \bar p_1(s), \dots, \bar p_n(s)) = 0 \quad\text{for all } s \in D. \]
Then there is a unique characteristic strip $\Gamma_s$ through each point $\Sigma(s)$, and the
union of the corresponding base curves is a hypersurface $M_u$. In order for the
function $u$ to be a solution to our PDE we will need two conditions, which
allow us to extend the proof of Theorem 3 essentially without change. First,
the matrix
\[
\begin{pmatrix}
D_1\sigma_1(s) & \cdots & D_{n-1}\sigma_1(s) & F_{p_1}(\Sigma(s)) \\
\vdots & & \vdots & \vdots \\
D_1\sigma_n(s) & \cdots & D_{n-1}\sigma_n(s) & F_{p_n}(\Sigma(s))
\end{pmatrix}
\]
must be nonsingular. Thus $\sigma(D) \subset \mathbb{R}^n$ must be an $(n - 1)$-manifold, and
$(F_{p_1}(\Sigma(s)), \dots, F_{p_n}(\Sigma(s)))$ must not lie in its tangent space; once again, we
express this by saying that the initial manifold $\sigma(D)$ is free for the initial data
$\bar u, \bar p_1, \dots, \bar p_n$. Second, we must have
\[ \frac{\partial \bar u}{\partial s_j} = \sum_{i=1}^n \bar p_i \cdot \frac{\partial \sigma_i}{\partial s_j}. \]
In terms of $\Sigma$, this condition reads
\[ \frac{\partial \Sigma_{n+1}}{\partial s_j} = \sum_{i=1}^n \Sigma_{n+1+i} \frac{\partial \Sigma_i}{\partial s_j} \]
and is called the strip manifold condition. If we think of a point $(x_1, \dots, x_n, z,
p_1, \dots, p_n)$ in $\mathbb{R}^{2n+1}$ as a hyperplane in $\mathbb{R}^{n+1}_{(x_1, \dots, x_n, z)}$, namely as the
hyperplane perpendicular to the vector $(p_1, \dots, p_n, -1)$, then $\Sigma\colon D \to \mathbb{R}^{2n+1}$ may
be regarded as a family of hyperplanes along the $(n - 1)$-dimensional submanifold
$\sigma(D)$. It is easy to see that $\Sigma$ satisfies the strip manifold condition if and
only if the tangent space of $\sigma(D)$ at any point $\sigma(s)$ always lies in the
hyperplane determined by $\Sigma$ at $s$. We may summarize by saying that we can solve
the Cauchy problem for any strip manifold $(\sigma_1, \dots, \sigma_n, \bar u, \bar p_1, \dots, \bar p_n)$ for which
the initial $(n - 1)$-dimensional submanifold $\sigma(D)$ is free for the initial data
$\bar u, \bar p_1, \dots, \bar p_n$.
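
The claim that $F(\Gamma(t))$ is constant along solutions of the characteristic system can be
checked in one line with the chain rule. Writing $p_i = \Gamma_{n+1+i}$ for brevity (a shorthand
used only in this remark) and evaluating all partials of $F$ at $\Gamma(t)$, we get
\[
\begin{aligned}
\frac{d}{dt} F(\Gamma(t))
&= \sum_{i=1}^n F_{x_i} \frac{d\Gamma_i}{dt} + F_u \frac{d\Gamma_{n+1}}{dt} + \sum_{i=1}^n F_{p_i} \frac{d\Gamma_{n+1+i}}{dt} \\
&= \sum_{i=1}^n F_{x_i} F_{p_i} + F_u \sum_{i=1}^n p_i F_{p_i} + \sum_{i=1}^n F_{p_i} \bigl( -F_{x_i} - p_i F_u \bigr) = 0.
\end{aligned}
\]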


BIBLIOGRAPHY 


This list contains all the books and journal articles that have been referenced, 
as well as others that I have consulted, or that might be considered classics, or 
that might be of interest for further reading, or a source for problems. 


Complete names for the journal references given here are listed on a separate page after 
this Bibliography, since it is often impossible to look up journals without knowing the 
unabbreviated name, and the “standard” abbreviations aren’t always so standard. 


Names of book publishers are generally treated more casually, since most large collected 
works can be found only in libraries, information about publishers is usually easy to find 
on the web, and new printings of old classics are continually appearing, with many new 
and old printings now available on the web. 


Abel, N. H. 
[1] “Auflösung einer mechanischen Aufgabe”, J. Reine Angew. Math., 1 (1826), 153-
157. (Also to be found in the first volume of Abel’s Œuvres Complètes.)


Abraham, Ralph and Marsden, Jerrold E. 
[1] Foundations of Mechanics, 2nd ed., Benjamin/Cummings, 1978. 


Alciatore, David G. 
[1] Web site: http://billiards.colostate.edu.


For the material discussed in Chapter 11, click on “Technical Proof (TP) analy- 
ses”, and scroll down to the TP A section, especially the material from TP A.4. 


Anh, Le Xuan 
[1] Dynamics of Mechanical Systems with Coulomb Friction, A. K. Belyaev, trans.,
Springer, 2003. 
Appell, Paul 
[1] Traité de Mécanique Rationnelle, 3rd ed., Gauthier-Villars, 1909-1937.


Arnold, V. I. 
[1] “Об одной теореме Лиувилля, касающейся интегрируемых проблем
динамики”, Сибирский математический журнал 4 No. 2, 1963.
[2] Mathematical Methods of Classical Mechanics, K. Vogtmann and A. Weinstein, 
trans., Springer, 1978; 2nd ed., 1989. 
[3] Huygens and Barrow, Newton and Hooke, Birkhäuser, 1990.


Arnold, V. I. and Avez, A. 
[1] Ergodic Problems of Classical Mechanics, Benjamin, 1968.


Atzema, Eisso J. 
[1] “Charles François Sturm’s Writings on Optics”, 49-65, in Collected Works of
Charles François Sturm, J-C. Pont et al. eds., Birkhäuser, 2009.
Audin, Michèle
[1] Torus Actions on Symplectic Manifolds, 2nd edition, Birkhäuser, 2004.
Auerbach, D. 
[1] “Colliding rods: dynamics and relevance to colliding balls”, Am. J. Phys. 62
(1994), 522-525.




Avez, A. See Arnold, V. I. and Avez, A. 


Barbour, J. and Pfister, H. 
[1] Mach’s Principle. From Newton’s Bucket to Quantum Gravity, J. Barbour and
H. Pfister, eds., Birkhäuser, 1995.


Barger, Vernon and Olsson, Martin 
[1] Classical Mechanics. A Modern Perspective, McGraw-Hill, 1955; 2nd ed., 1973. 


Belorizky, Elie and Sivardière, Jean
[1] “Comments on the horizontal deflection of a falling object”, Am. J. Phys. 55
(1987), 1103-1104. 


Berry, M. V. 
[1] “Quantal phase factors accompanying adiabatic changes”, Proc. R. Soc. Lond. A 
392 (1984), 45-57. 
[2] “Classical adiabatic angles and quantal adiabatic phase”, J. Phys. A. Math. Gen.
18 (1985), 15-27. 


Bertrand, J. 
[1] “Théorème relatif au mouvement d’un point attiré vers un centre fixe”, C. R.
Acad. Sci. Paris 77 (1873), 849-853.
An English translation is available at arXiv.org/abs/0704.2396


Beutler, G. 
[1] Methods of Celestial Mechanics, Vol. 1, Springer, 2005. 


Boccaletti, D. and Pucacco, G. 
[1] Theory of Orbits, Vol. 1, Springer, 1996. 
Bohlin, M. K. 


[1] “Note sur le problème des deux corps et sur une intégration nouvelle dans le
problème des trois corps”, Bull. Astr. 28 (1911), 113-119.


Bolsinov, A. V. and Fomenko, A. T.
[1] Integrable Hamiltonian Systems, Chapman & Hall/CRC, 2004.


Born, Max 
[1] Vorlesungen über Atommechanik, Springer, 1925.


Available as The Mechanics of the Atom, J. W. Fisher, trans., D. R. Hartree, revi- 
sions, Bell and Sons, 1927. 


Born, Max and Wolf, Emil 
[1] Principles of Optics, 7th ed., Cambridge University Press, 1999. 


Bou-Rabee, Nawaf M., Marsden, Jerrold E., and Romero, Louis A. 
[1] “A geometric treatment of Jellett’s egg”, Z. Angew. Math. Mech. 85 (2005), 618-
642. 


Brogliato, B. 
[1] Nonsmooth Mechanics, 2nd ed., Springer, 1999.


See also Génot, F. 


Bruce, Ian 
[1] Web site: 17centurymaths.com.




Bruns, Heinrich 
[1] “Das Eikonal”, Abh. Math. Phys. Kl. Konigl. Sachs. Ges. Wiss., 21 (1895), 322-
436. 


Buchwald, Jed Z. 
[1] The Rise of the Wave Theory of Light, University of Chicago Press, 1989.


Burgers, J. M. 
[1] “Die adiabatischen Invarianten bedingt periodische Systeme”, Ann. Physik 52
(1917), 195-202. 


Cabannes, Henri 
[1] General Mechanics, 2nd ed., S. P. Sutera, trans., Blaisdell, 1968.


Calkin, M. G. 
[1] Lagrangian and Hamiltonian Mechanics, World Scientific, 1966. 


Calkin, M. G. and March, R. H. 
[1] “The dynamics of a falling chain: I”, Am. J. Phys. 57 (1989), 154-157.


Carathéodory, Constantin 
[1] Calculus of Variations and Partial Differential Equations of the First Order. Part I, 
Robert B. Dean and Julius J. Brandstatter, trans., Holden-Day, 1965. 


Casey, James 
[1] “The elasticity of wood”, The Physics Teacher, 31 (1993), 286-288. 


Cayley, Arthur 
[1] “On a class of dynamical problems”, Proc. R. Soc. Lond. 8 (1857), 506-511. 


Ceanga, V. and Hurmuzlu, Y. 
[1] “A new look at an old problem: Newton’s cradle”, J. Appl. Mechanics 68 (2001),
575-583.


Chandrasekhar, S. 
[1] Ellipsoidal Figures of Equilibrium, Dover, 1987. A reprint, with revisions, of the 
1969 book published by Yale University Press. 
[2] Newton’s Principia for the Common Reader, Clarendon, 1995.


Chapman, S. 
[1] “Misconception concerning the dynamics of the impact ball apparatus”, Am. J.
Phys. 28 (1960), 705-711.


Chicone, C. and Jacobs, M. 
[1] “Bifurcation of critical periods for plane vector fields”, Trans. Am. Math. Soc. 
312 (1989), 433-486. 


Cohen, Bernard, and Whitman, Anne 


See Newton [1]. We use “Cohen-Whitman” whenever we are referring to addi- 
tional commentary, as opposed to the translation itself. 


Cohen, Richard J. 
[1] “The tippe top revisited”, Am. J. Phys. 45 (1977), 12-17.


Cordani, Bruno 


[1] The Kepler Problem, Birkhäuser, 2003.




Coriolis, G.-G. 
[1] Théorie Mathématique des Effets du Jeu de Billard, together with Coriolis’ papers
of 1832 and 1835, Éditions Jacques Gabay, 1990.
[2] Mathematical Theory of Spin, Friction, and Collision in the Game of Billiards, David
Nadler, trans. Privately printed, available at www.coriolisbilliards.com.


Courant, R. and Hilbert, D. 
[1] Methods of Mathematical Physics, Vol. 2, Wiley & Sons, 1962. 


Delaunay, C. 
[1] Théorie du Mouvement de la Lune, tome I, [Mem. Acad. Sci. 28], Mallet-Bachelier,
1860; tome II, [Mem. Acad. Sci., 29], Gauthier-Villars, 1867.
DG. See Spivak, Michael [2]. 
Dicke, R.H. See Roll, P.G. 
Dijksterhuis, E. J.
[1] Archimedes, Princeton University Press, 1987. 
Duck, Ian and Sudarshan, E. C. G. 
[1] 100 years of Planck’s Quantum, World Scientific, 2000.


Dugas, Rene 
[1] A History of Mechanics, J. R. Maddox, trans., Dover, 1988.


Einstein, A. 
[1] “Zum Quantensatz von Sommerfeld und Epstein”, Verh. Dtsch. Phys. Ges. 19, 
82-92. 
[2] “Prinzipielles zur allgemeinen Relativitätstheorie”, Ann. Physik 55 (1918), 241-
244.
Euclid 
[1] The Arabic Version of Euclid’s Optics. (Kitab Uqlidis fi Ikhtilaf al-manazir), Vol. 1,
Elaheh Kheirandish trans., Springer, 1999. 
Fasano, Antonio and Marmi, Stefano 
[1] Analytical Mechanics, Beatrice Pelloni, trans., Oxford University Press, 2006.
Fenchel, W.
[1] “On conjugate convex functions”, Can. J. Math., 1 (1949), 73-77.


Feynman, R., Leighton, R., and Sands, M. 
[1] The Feynman Lectures on Physics, Volume I, Addison Wesley, 1963.


Finch, Janet D. See Hand, Louis N. 
Fomenko, A. T. See Bolsinov, A. V.
French, A. P.
[1] “The deflection of falling objects”, Am. J. Phys. 52 (1984), 199.


Galilei, Galileo 
[1] Dialogues Concerning the Two Chief World Systems, Stillman Drake, trans., Uni- 
versity of California Press, 1967. 
[2] Dialogues Concerning Two New Sciences, Henry Crew and Alfonso de Salvio,
trans., Dover, 1954. 


Garwin, Richard L. 
[1] “Kinematics of an Ultraelastic Rough Ball”, Am. J. Phys. 37 (1969), 88-92.




Gary, C.G. and Nickel, B. G. 
[1] “Constants of the motion for nonslipping tippe tops and other tops with round 
pegs”, Am. J. Phys. 68 (2000), 821-828.


Gauss, C. F.
[1] Werke, Vol. 5, 498-503. 


Génot, F. and Brogliato, B.
[1] “New results on Painlevé paradoxes”, Eur. J. Mech. A. Solids 18 (1999), 653-677.


Giaquinta, Mariano and Hildebrandt, Stefan 
[1] Calculus of Variations I, Springer, 1996. 


Glad, T. See Rauch-Wojciechowski, S. 


Goldstein, Herbert 
[1] Classical Mechanics, Addison Wesley, 1950.


Gras-Mati, Albert See Santos-Beniot, Julio, V. 
Griffith, B. See Synge, J. 


Guicciardini, N. 
[1] “Johann Bernoulli, John Keill and the inverse problem of central forces”, Ann. 
of Sci. 52 (1995), 537-575.


Hale, Jack K. 
[1] Oscillations in Nonlinear Systems, McGraw-Hill, 1963.
Hall, E. H. 
[1] “Do falling bodies move south?”, Physical Review, 17 (1903), 179-190 and 245-
254. 


Hamel, Georg 
[1] Theoretische Mechanik, Springer, 1949.


Hamilton, William Rowan 
[1] The Mathematical Papers of Sir William Rowan Hamilton, 4 Vols., Cambridge
University Press, 1931. 


Vol. 1 contains the papers on optics, Vol. 2 the two main papers on dynamics. 
Hand, Louis N. and Finch, Janet D. 
[1] Analytical Mechanics, Cambridge University Press, 1998.
Hannay, J. H. 


[1] “Angle variable holonomy in adiabatic excursion of an integrable Hamiltonian”, 


J. Phys. A. Math. Gen. 18 (1985), 221-230. 
Helmholz, A. Carl See Kittel, Charles 


Herivel, J. 
[1] The Background to Newton’s Principia, Oxford University Press, 1965. 


Herrmann, F. and Schmälzle, P.
[1] “Simple explanation of a well-known collision experiment”, Am. J. Phys. 49
(1981), 761-764. 


Herrmann, F. and Seitz, M. 
[1] “How does the ball-chain work?”, Am. J. Phys. 50 (1982), 977-981.




Herzberger, Max 
[1] Modern Geometrical Optics, Interscience, 1958. 


Hilbert, D. See Courant, R. 
Hildebrandt, Stefan. See Giaquinta, Mariano. 
Hjorth, P. G. See Knudsen, J. M.
Hugenholtz, N. M. 
[1] “On tops rising by friction”, Physica (Amsterdam), 18 (1952), 503-514. 
Hughes, Thomas J. R. See Marsden, Jerrold E. 
Hurmuzlu, Y. See Ceanga, V. 


Huygens, Christiaan
[1] The Pendulum Clock, R. J. Blackwell, trans., Iowa State University Press, 1986.
[2] Treatise on Light, BiblioBazaar, 2007. A reprint of the translation of 1912 by 
Silvanus P. Thompson. 
Jacobi, C.G. J. 
[1] Vorlesungen über Dynamik, included as Vol. 8 of his Œuvres Complètes.
Now available as Jacobi’s Lectures on Dynamics, Second Edition, Edited by
A. Clebsch, K. Balagangadharan and Biswarup Banerjee, trans., 
Hindustan Book Agency, 2009. 


Jacobs, M. See Chicone, C. 


Jellett, J. H. 
[1] A Treatise on the Theory of Friction, Hodges, Foster and Co. and MacMillan, 1872. 


Available for reading or download at books.google.com.


Jost, R. 
[1] “Winkel- und Wirkungsvariable für allgemeine mechanische Systeme”, Helvetica
Physica Acta, 41 (1968), 965-968. 
Kevorkian, J. 
[1] Partial Differential Equations, Wadsworth & Brooks/Cole, 1990; 2nd ed. Springer,
2000. 
Khein, Alexander and Nelson, D. F.
[1] “Hannay angle study of the Foucault pendulum in action-angle variables”, Am.
J. Phys. 61 (1993), 170-174.
[2] “A persistent error in action-angle treatments of Hamiltonian mechanics”, Am.
J. Phys. 61 (1993), 175-176.
Kheirandish, Elaheh See Euclid [1]. 


Kittel, Charles, Knight, Walter D., Ruderman, Malvin, A., Helmholz, A. Carl, and 
Moyer, Burton J. 
[1] Mechanics. Berkeley Physics Course. Volume 1, 2nd ed. McGraw-Hill, 1973. 
Klein, Martin J. 
[1] Paul Ehrenfest, North-Holland, 1970. 


Kleppner, D. and Kolenkow, R. 
[1] An Introduction to Mechanics, McGraw-Hill, 1973. 




Knight, Walter D. See Kittel, Charles 


Knudsen, J. M. and Hjorth, P. G.
[1] Elements of Newtonian Mechanics, 3rd ed., Springer, 2000. 


Kolenkow, R. See Kleppner, D. 


Kozlov, V. V. 
[1] Dynamical Systems X. General theory of Vortices, Springer, 2003. 
Krim, Jacqueline 
[1] “Friction at the atomic scale”, Scientific American, October, 1996, 74-80.


Krotkov, R. See Roll, P. G. 


Lagrange, J.-L. 
[1] Mécanique Analytique, Gauthier-Villars et fils, 1788. 


Available in English translation as Analytical Mechanics, Auguste Boissonnade 
and Vidor N. Vagliente, eds. and trans., Kluwer, 1997. 


Laidler, Keith J. 
[1] “The meaning of ‘adiabatic’”, Can. J. Chem. 72 (1994),


Landau, L. D. and Lifschitz, E. M. 
[1] Mechanics, 3rd ed., J.B. Sykes and J. S. Bell, trans., Pergamon, 1994. 


Laplace, 
[1] Œuvres Complètes, Vol. 14, 267-277.


Lee, Hwa-Chung 
[1] “The universal integral invariants of Hamiltonian systems and applications to 
the theory of canonical transformation”, Proc. R. Soc. Edinburgh, Sect. A. 11 Part 
3 (1946-48). 
Legendre, A.-M. 
[1] “Mémoire sur l’intégration de quelques équations aux différences partielles”,
Mémoires de l’Académie Royale des Sciences (1789), 309-351.
Legendre’s transformation had been given earlier by Euler (cf. Giaquinta and 
Hildebrandt [1; pg. 146]), who wrote mathematics faster than other mathemati- 
cians could read it, but Euler already has enough things named after him. 
Leighton, R.B. See Feynman, R. and Neher, H. V. 


Leutwyler, H. 
[1] “Why some tops tip”, Eur. J. Phys. 15 (1994), 59-61.


Lifschitz, E.M. See Landau, L. D. 


Liouville, J. 
[1] “Note sur l’intégration des équations différentielles de la Dynamique, présentée
au Bureau des Longitudes le 29 juin 1853”, J. Math. Pures Appl., 20 (1855),
137-138. 


Lochak, P. and Meunier, C.
[1] Multiphase Averaging for Classical Systems. With Applications to Adiabatic Theo- 
rems, Springer, 1988. 
Mach, E. 
[1] The Science of Mechanics, Open Court Publishing Co., 1974.




March, R.H. See Calkin, M. G. 
Marmi, Stefano See Fasano, Antonio 
Marsden, Jerrold E. See Abraham, Ralph and Bou-Rabee, Nawaf M. 


Marsden, Jerrold E. and Hughes, Thomas J. R. 
[1] Mathematical Foundations of Elasticity, Prentice-Hall, 1983. 


Marsden, Jerrold E. and Ratiu, Tudor S. 
[1] Introduction to Mechanics and Symmetry, 2nd ed., Springer, 1999.


Maxwell, James Clerk 
[1] The Electrical Researches of Henry Cavendish, edited by Maxwell, Frank Cass & 
Company, Ltd., 1967. 
McOwen, Robert 
[1] Partial Differential Equations, Prentice Hall, 1996. 
Mehra, Jagdish and Rechenberg, Helmut 
[1] The Historical Development of Quantum Theory, Volume 1 (published in 2 parts),
Springer, 1982. 
Meunier, C. See Lochak, P. 
Meyer, E. 
[1] Nanoscience: friction and rheology on the nanometer scale, World Scientific, 1998.
Miller, K.S. and Ross, B. 
[1] An Introduction to the Fractional Calculus and Fractional Differential Equations, 
Wiley & Sons, 1993. 
Mineur, H. 
[1] “Réduction des systèmes mécaniques à n degrés de liberté admettant n intégrales
premières uniformes en involution aux systèmes à variables séparées”, J. Math.
Pures Appl. 15 (1936), 385-389.


Misner, C.W., Thorne, K.S., and Wheeler, J. A. 
[1] Gravitation, W. H. Freeman and Company, 1973. 


Moffatt, H. K. and Shimomura, Y. 
[1] “Spinning eggs — a paradox resolved”, Nature 416 (2002), 385-386. 


Morin, David 
[1] Introduction to Classical Mechanics With Problems and Solutions, Cambridge
University Press, 2008.


Moser, Jürgen
[1] “On the volume elements on a manifold”, Trans. Am. Math. Soc. 120 (1965), 
286-294. 


See also Siegel, C. L. 
Moyer, Burton J. See Kittel, Charles 
Mukunda, N. See Sudarshan, E. C. G. 


Nahin, Paul J. 
[1] When Least Is Best, Princeton University Press, 2004.




Navarro, L. and Perez, E.
[1] “Paul Ehrenfest on the necessity of quanta (1911): discontinuity, quantization,
corpuscularity and adiabatic invariance”, Archs. Hist. Exact Sci. 58 (2004) 97-
141.
[2] “Paul Ehrenfest: The genesis of the adiabatic hypothesis, 1911-1914”, Archs. Hist.
Exact Sci. 60 (2006) 209-267.


Neher, H. V. and Leighton, R. B. 
[1] “Linear Air Trough”, Am. J. Phys. 31 (1963), 225.


Nelson, D. F. See Khein, Alexander


Newton, I. 
[1] The Correspondence of Isaac Newton, Cambridge University Press, 1959-1977.
[2] The Principia, Translated by Bernard Cohen and Anne Whitman, University of
California Press, 1999. 


Nickel, B.G. See Gary, C.G. 
Noether, Emmy 
[1] “Invariante Variationsprobleme”, Nachr. Kgl. Ges. Wiss. Gottg., Math-phys. Klasse
(1918), 235-257.
English translation available on-line at arXiv.org/abs/physics/0503066v1.
Olsson, Martin See Barger, Vernon 


Osgood, W. F.
[1] Mechanics, Macmillan, 1937; Crawford Press, 2007.
Page, Leigh 
[1] Introduction to theoretical physics, 3rd ed., Van Nostrand, 1952. 


Painlevé, P. 
[1] Leçons sur le frottement, Hermann, 1895.


Available for reading or download at books.google.com (search for the title 
“sur le frottement” to avoid problems with the cedilla). 

[2] “Sur les lois du frottement de glissement”, C. R. Acad. Sci. Paris 121, 112-115;
141, 401-405; 141, 546-552.


Palais, Richard S. and Palais, Robert A. 
[1] Differential Equations, Mechanics, and Computation, American Mathematical
Society and IAS/Park City Mathematics Institute, 2009.
Pars, L. A. 
[1] A Treatise on Analytical Dynamics, Wiley & Sons, 1965.
Perez, E. 
[1] “Ehrenfest’s adiabatic theory and the old quantum theory”, Archs. Hist. Exact 
Sci. 63 81-125.
See also Navarro, L.
Persson, Anders O. 
[1] “The Coriolis Effect: Four centuries of conflict between common sense and
mathematics, Part I: A history to 1885”, History of Meteorology 2 (2005). See
meteohistory.org/2005historyofmeteorology2/01persson.pdf.




Pfister, H. See Barbour, J. 


Planck, Max 
[1] Scientific Autobiography, Philosophical Library, New York, 1949.


Pliskin, W. A. 
[1] “The Tippe Top (Topsy-Turvy Top)”, Am. J. Phys. 22 (1954), 28-32.


Poinsot, M. 
[1] Théorie Nouvelle de la Rotation des Corps, Bachelier, 1851.


Pourciau, B. 
[1] “On Newton’s proof that inverse-square orbits must be conics”, Ann. of Sci., 48
(1991), 159-172.
[2] “Newton’s solution of the one-body problem”, Archs. Hist. Exact Sci. 44 (1992),
125-146.
[3] “The integrability of ovals: Newton’s lemma 28 and its counterexamples”, Archs.
Hist. Exact Sci. 55 (2001), 479-499.
[4] “Newton’s argument for proposition 1 of the Principia”, Archs. Hist. Exact Sci.
57 (2003), 267-311.
[5] “Force, deflection, and time: Proposition VI of Newton’s Principia”, Hist. Math.
34 (2007), 140-172.
[6] “Newton’s Second Law (as Newton Understood it) from Galileo to Laplace” (to 
appear). 
Pucacco, G. See Boccaletti, D. 
Ratiu, Tudor S. See Marsden, Jerrold E.


Rauch-Wojciechowski, S., Sköldstam, M., and Glad, T.
[1] “Mathematical analysis of the Tippe Top”, Regular and Chaotic Dynamics, 10 
2005, 333-362. 


Rayleigh [Lord Rayleigh] (John William Strutt, Baron Rayleigh)
[1] “On the pressure of vibrations”, Philosophical Magazine 3 (1902), 338-346. (Also
to be found in Vol. 5 of Lord Rayleigh’s Scientific Papers.)
Rechenberg, Helmut See Mehra, Jagdish 
Reddingius, E. 
[1] “Comment on ‘The eastward deflection of a falling object’”, Am. J. Phys. 52
(1984), 562-563. 


Roll, P. G., Krotkov, R., and Dicke, R. H.
[1] “The equivalence of inertial and passive gravitational mass”, Ann. Phys. (U.S.A.) 
26 (1964), 442-517. 


Romero, Louis A. See Bou-Rabee, Nawaf M. 


Ronchi, Vasco 
[1] The Nature of Light, Harvard University Press, 1970.


Ross, B. See Miller, K.S. 


Routh, E. J. 
[1] Advanced Dynamics of a System of Rigid Bodies, Dover 1955 reprint. Available 
from Dover Books; original 1905 printing available for reading or download at 
books.google.com.




Ruderman, Malvin, A. See Kittel, Charles 
Sabra, A. I. 

[1] Theories of Light From Descartes to Newton, Oldbourne, London, 1967. 
Sands, M. See Feynman, R.
Santos-Beniot, Julio, V. and Gras-Mati, Albert
[1] “Ubiquitous drawing errors in the simple pendulum”,
albertgrasmarti.org/agm/recerca-divulgacio/pendulum-TPT.pdf
Schmälzle, P. See Herrmann, F.
Schrödinger, E.


[1] “An undulatory theory of the mechanics of atoms and molecules”, Physical Re- 
view 28 (1926), 1049-1070. 


Schulz-DuBois, E. O. 
[1] “Foucault pendulum experiment by Kamerlingh Onnes and degenerate pertur- 
bation theory”, Am. 7. Phys. 38 (1970), 173-188. 
Schwarzschild, K. 
[1] “Zur Quantenhypothese”, Sitzungsber. Deut. Akad. Wiss. Berlin Kl. Math. Phys.
Tech. 16 (1916), 548-568. 
Seitz, M. See Herrmann, F.
Shapiro, A. 
[1] “Bath-Tub Vortex”, Nature, 196 (1962), 1080-1081. 
Shimomura, Y. See Moffatt, H. K. 


Siegel, C. L. and Moser, J. K. 
[1] Lectures on Celestial Mechanics, Springer-Verlag, 1971. 


Siklos, Stephen T. C. See Wells, Clive G.
Sivardière, Jean. See Belorizky, Elie.
Sköldstam, M. See Rauch-Wojciechowski, S.


Smith, D. E. 
[1] A Source Book in Mathematics, Dover Publications. 


Sommerfeld, Arnold 
[1] Atombau und Spektrallinien, 4th ed., Vieweg und Sohn, 1924.


In the Mathematical Addenda, §6 [as of the 4th edition] introduces the use of 
contour integrals, and §13 says of the use of the method of separation of variables: 
Er ist für die Quantenprobleme ein wirklicher „Königsweg“. This was rendered
in Goldstein [1; §9-8] as “a royal road to quantization”, the standard quotation 
nowadays. 


[2] Mechanics, Academic Press, 1964. 
Spivak, Michael 
[1] Calculus, 4th ed., Publish or Perish, Inc., 2008. 


[2] A Comprehensive Introduction to Differential Geometry, 5 volumes, 3rd ed., Publish 
or Perish, Inc., 2005. 


Strutt, John William See Rayleigh [Lord Rayleigh] 




Sudarshan, E.C.G. See Duck, Ian. 
Sudarshan, E.C.G. and Mukunda, N. 
[1] Classical Dynamics: A Modern Perspective, Wiley, 1974.
Synge, J. and Griffith, B. 
[1] Principles of Mechanics, 3rd ed., McGraw-Hill, 1959. 
Terrall, Mary 
[1] The Man who Flattened the Earth: Maupertuis and the Sciences in the Enlightenment,
Chicago, 2002. 
Thorne, K.S. See Misner, C. W. 
Tobin, William 
[1] The Life and Science of Léon Foucault, Cambridge University Press, 2003.
Trefethen, L. et al. 
[1] “The Bath-Tub Vortex in the Southern Hemisphere”, Nature, 207 (1965), 1084- 
1085. 
Treschev, Dmitry and Zubelevich, Oleg 
[1] Introduction to the Perturbation Theory of Hamiltonian Systems, Springer, 2010. 
Truesdell, C. 
[1] Essays in the History of Mechanics, Springer-Verlag, 1968. 


van der Waerden, B. L. 
[1] “La démonstration dans les sciences exactes de l’antiquité”, Bull. Soc. Math. 
Belg. 4 (1957), 8-20. 


Vilasi, Gaetano 
[1] Hamiltonian Dynamics, World Scientific, 2001. 


Weinstein, Alan 
[1] “Symplectic structures on Banach manifolds”, Bull. Am. Math. Soc. 75 (1969), 
1040-1041. 
[2] “Symplectic geometry”, Bull. Am. Math. Soc. 5 (July 1981), 1-13. 


Wells, Clive G. and Siklos, Stephen T. C. 
[1] “The adiabatic invariance of the action variable in classical dynamics”, Eur. J.
Phys. 28 (2007), 105-112.
Wheeler, J. A. See Misner, C. W. 
Whitman, Anne See Cohen, Bernard 


Whittaker, E. T. 
[1] A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, 4th ed.,
Cambridge University Press, 1960.


Wintner, A. 
[1] The Analytical Foundations of Celestial Mechanics, 1943.


Wolf, Emil. See Born, Max. 


Wong, C. W. and Yasui, K.
[1] “Falling chains”, Am. J. Phys. 74 (2006), 490-496.


Yasui, K. See Wong, C. W. 
Zubelevich, Oleg See Treschev, Dmitry 




UNABBREVIATED JOURNAL TITLES 


Abh. Math. Phys. Kl. Konigl. Sachs. Ges. Wiss. = Abhandlungen
der Mathematisch-Physischen Classe der Königlich Sächsischen
Gesellschaft der Wissenschaften


Am. J. Phys. = American Journal of Physics 

Ann. of Sci. = Annals of Science 

Ann. Phys. = Annals of Physics 

Ann. Physik = Annalen der Physik 

Archs. Hist. Exact Sci. = Archives for History of Exact Sciences 

Bull. Am. Math. Soc. = Bulletin. American Mathematical Society 

Bull. Astr. = Bulletin Astronomique 

Bull. Soc. Math. Belg. = Bulletin. Société Mathématique de Belgique

C. R. Acad. Sci. Paris = Comptes Rendus Académie des Sciences (Paris) 

Can. J. Chem. = Canadian Journal of Chemistry 

Can. J. Math. = Canadian Journal of Mathematics 

Eur. J. Mech. A. Solids = European Journal of Mechanics A. Solids 

Eur. J. Phys. = European Journal of Physics 

Hist. Math. = Historia Mathematica 

J. Appl. Mechanics = Journal of Applied Mechanics 

J. Math. Pures Appl. = Journal de mathématiques pures et appliquées

J. Phys. A. Math. Gen. = Journal of Physics. A. Mathematical and General 

J. Reine Angew. Math. = Journal für die Reine und Angewandte Mathematik

Mem. Acad. Sci. = Mémoires. Académie des Sciences. Institut de France. 

Nachr. Kgl. Ges. Wiss. Gottg., Math-phys. Klasse = Nachrichten der Königlichen
Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physischen Klasse

Proc. R. Soc. Lond. [A.] = Proceedings of the Royal Society of London. [A.] 


Proc. R. Soc. Edinburgh, Sect. A. = Proceedings. Royal Society of Edinburgh. 
Section A. Mathematical and Physical Sciences 


Sitzungsber. Deut. Akad. Wiss. Berlin Kl. Math. Phys. Tech. = Sitzungsberichte.
Deutsche Akademie der Wissenschaften zu Berlin. Klasse für Mathematik,
Physik, und Technik.


Trans. Am. Math. Soc. = Transactions. American Mathematical Society
Verh. Dtsch. Phys. Ges. = Verhandlungen. Deutsche Physikalische Gesellschaft
Z. Angew. Math. Mech. = Zeitschrift für Angewandte Mathematik und Mechanik


INDEX 


a-particles, 148 
Abel, 288 
Abel’s integral equation, 318 
Abelian integrals, 73 
Abraham and Marsden, 404, 590, 596, 
640 
absolute motion, 12 
absolute space, 12, 275, 393 
abuse of notation, 88 
acceleration corrections, 378 
acceleration of gravity, 14 
Achilles, 400 
action, 449, 464, 529, 634 
quantity of, 497 
action integral, 497, 527 
action variables, 631 
action-angle terminology, 640 
action-angle variables, 630, 632, 637 
frequencies of, 635 
addition of angular momentum vec- 
tors, 344 
additivity of mass, 16, 35 
adhesion, molecular, 410 
adiabatic, 644 
adiabatic invariance of J, 651 
adiabatic invariant, 433, 641, 644, 651 
adiabatic limit, 655 
Agamemnon, 400 
Aharonov-Bohm effect, 654 
air trough, 11 
Airy equation, 648 
Airy functions, 648 
Alhazen, 487, 289 
allergy, 513 
alternating (skew-symmetric) map asso- 
ciated with skew-symmetric matrix, 
568 
Amontons, 409 
Amontons’ first law of friction, 409 
Amontons’ second law of friction, 409 
Amontons-Coulomb friction, 411
amplitude, 287 
infinite, 299 


Analysis ad Refractiones, 490 
analytical mechanics, 439 
specialized considerations for, 440 
analytically defined cross-product, 82 
Aeneas, 400
angle(s) 
apsidal, 128, 132, 135 
cut, 423 
dynamical, 654 
Euler, 341ff.
variants of, 342 
geometric, 655 
Hannay, 653, 655, 659, 664 
of friction, 434 
of incidence, 249, 487 
of reflection, 249, 487 
scattering, 114, 147 
laboratory, 115 
angular coordinates, 623 
angular frequency, 18, 42 
angular momentum, 83 
conservation of, 81, 470, 475 
as vector equation, 87 
elementary illustration of, 86 
using to change directions, 87 
law, 84, 186 
elementary example of, 197 
of system, 83 
per unit mass, 121
rotational, 84, 187 
vectors can be added, 344 
angular velocity, 83 
angular wave number, 313 
Anh, 417 
anomaly 
eccentric, 163, 643 
true, 163, 643 
Anschütz-gyroscope, 391
Anschütz-Kaempfe, 391
anti-sync, 479 
oscillations in, 303 
aphelion, 128 
apoapsis, 128 


apocenter, 128 
apogee, 128 
Appell, 172, 371
applied mechanics, 110, 206, 417 
approximate solutions, 532 
approximation, seat-of-the-pants, for 
Tippe Top, 428 
approximations to small oscillations, 
292 
apsidal angle, 128, 132, 135 
apsides, 128 
motion of, 129ff. 
apsis, 128 
Aquarius, 361 
aragonite, 509, 526 
Archimedes, 3 
Archimedean solid, 204 
area in polar coordinates, 56, 630, 631 
areas, law of, 85
argument of perihelion, 565 
Arnold, 73, 151, 154, 283, 384, 468, 
561, 573, 590, 592, 603, 621, 640, 
655 
Arnold and Avez, 640 
Arnold tori, 62 
Ars Magna, 356 
artillery shells, 387 
ascending node, 565 
asteroid companions of Earth, 401 
asteroid, Trojan, 396
Atwood, 242 
Atwood machine, 241, 242 
usual analysis of, 245 
with massive wheel, 246 
Atzema, 499 
Audin, 640 
Auerbach, 110 
averaged equation, 647 
averaging principle, 646, 648 
averaging theorem, 648ff.
axis
parallel axis theorem, 191, 196 
perpendicular axis theorem, 202 
principal, of inertia, 189 
rotation about, 191, 193 
rotation about, rigid body, 334 
stable and unstable, 335 


balance, 54
torsion, 39
Barbour and Pfister, 395 
Barger and Olsson, 251, 428 
base curve, 683 
basis, symplectic, 601 
bath-tubs, 392 
baton, 81 
beats, 299, 309, 390 
Belorizky and Sivardiére, 386 
Beltrami’s identity, 271 
bending of solid bodies, 206 
bending, thin plank, 256 
Benedetti, 19 
Benzenberg, 384 
Bernoulli, 257
Euler-Bernoulli equation, 256
Bernoulli, Daniel, 279 
Bernoulli, Johann, 282 
Berra (Yogi), 461 
Berry, 654 
Bertrand, 151, 172 
Bessel, 38, 259 
Beutler, 404 
bi-axial crystal, 526 
bilinear covariant, 590 
billiard balls, collision of, 423
billiard shot
high, 418
low, 419
billiard tables, cushions of, height of,
418
billiards
draw shot, 419
follow shot, 419
game of, 417
role of friction in, 417
billiards.colostate.edu, 420 
blackboard, chalk on, 412, 414 
Boccaletti and Pucacco, 404, 559, 640 
body cone, 341 
body coordinates, free symmetric top
in, 339
body derivative, 333
body, rigid, 491 
body, rotating axes within, 424 
Bohlin, 154 


Bohnenberger, 356 
Bohr, 427, 558, 640, 641, 646 
Bolsinov and Fomenko, 641 
Boltzmann, 646 
Born, 640, 641, 645 
Born and Wolf, 501, 502, 563
Bou-Rabee, Marsden, and Romero, 
433 
Boundary Term Corollary, 462, 468,
539, 572 
Bracket Formula, Cartan’s, 576, 595 
bracket, Poisson, 605 
invariant definition of, 606 
well-defined, 605 
braking, self-, 414 
bridge, suspension, 269 
Brogliato, 417 
Bruce, 288 
Bruns, 555 
Buchwald, 492, 494, 498 
bucket, rotating, 393 
buckling, column, 257 
bulging of earth near equator, 20-21, 
405 
effect on plumb bob, 381 
Burgers, 646 


Cabannes, 374, 452 
cable, hanging, 268-271 
calculus, fractional, 319 
Calkin, 642, 656 
Calkin and March, 106 
caloric, 498 
camera obscura, 487 
camera, pin-hole, 487 
cannonball, 276 
deflection of, 386 
shot upwards, 386 
canonical coordinates, 571 
canonical equations, Hamilton’s, 540, 
546, 547 
canonical form, in linearization, 324 
canonical transformation, 567, 569, 
576 
generalized, 571, 593 
homogeneous, 596 
1-parameter family of, 575, 576


time-dependent, 581. 
real usefulness of, 586 
canonical, as symplectic, 603 
canonically conjugate, 571 
capstan, 435 
Carathéodory, 603 
Cardan suspension, 356 
Cardan, Jerome (Cardano, Girolamo), 
356 
Carlyle, 466 
Carnot energy loss, 98 
temperature change of, 98 
Carnot, Lazarus, 98 
Carnot, Sadi, 98 
Cartan, see Poincaré-Cartan
Cartan formula, 576 
Cartan’s Bracket Formula, 576, 595 
Cartan’s Magic Formula, 576, 595 
Casey, 257 
Cauchy problem, 672 
Cavendish, 39 
Cayley, 102, 109 
Ceanga and Hurmuzlu, 110 
center, fall into, 160 
center of gravity, 5, 81, 204 
center of mass, 80, 190 
coordinates, 94 
central force, 85, 552, 553 
not radially symmetric, 116 
radially symmetric, 74, 88 
towards origin, 71
Newton’s analysis of, 55ff.
centrifugal force, 38, 377, 378, 396
and rocket launches, 117 
centripetal force, 365, 378 
Chadwick, 113 
chain, 100ff.
folded, falling, 103 
link of, 102-105 
yanked, 102, 105 
chain and sprocket wheel, 246 
chalk on blackboard, 412, 414 
chandelier, 307 
Chandler, 340 
Chandler wobble, 340 
Chandrasekhar, 7, 21, 73, 128, 138, 
142, 143, 466 


change of momentum, 12 
Chapman, 110 
characteristic curve, 590, 669, 674, 
681, 689 
characteristic for, 677, 699 
characteristic function 
for mechanics, 527 
Hamilton’s, 554 
Hamilton-Jacobi equation for, 554 
in optics, 524, 527 
characteristic strip, 683, 690 
characteristic subspace, 590 
characteristic surface, 562 
characteristic vector field, 669, 681, 
689, 674 
charged particle, moving, 52 
Chicone and Jacobs, 133 
China, 356 
circle, osculating, 62 
circuit, electrical, 293 
damping for, 295 
circular frequencies of invariant torus, 
621 
circular frequency, 287 
of driving force, 298 
circular orbits, 67, 121 
not stable for inverse cube force, 135 
circulation, 573 
circumference of earth, 67-68
Clairaut equation, 518, 543 
cn, elliptic function, 372 
coefficient of friction, 225 
coefficient of kinetic friction, 411
coefficient of restitution, 94 
Cohen, 430 
Cohen and Whitman, 7, 55, 67, 68, 
73, 142, 143, 283, 361 
college ring, 426, 430 
collinear forces, 53 
collisions 
of rigid bodies, 206 
conservation of energy in, 94, 206 
“head-on”, 97 
more general, 97 
color, 495 
column buckling, 257 
complete integral, 540, 541 


completely elastic, 94, 96 
completely inelastic, 94, 96, 98, 102 
composition of forces, 28, 30 
compound pendulum, 211 
condensed notation, 443, 529, 531
conditional critical points, Lagrange 
multipliers for, 471 
Euler’s Rule for, 474 
conditionally periodic, 620 
cone 
body, 341 
Monge, 679 
polhode, 366 
space, 341
configuration space, 179 
for pendulum, 209 
coordinate system on, 209 
for rolling disc, 231 
congruence of matrices, 568 
conic section orbit, centered at focus, 
124 
conical refraction, 526 
external, 526 
internal, 526 
conjugate, 525 
canonically, 571 
hyperbola, 159 
momentum, 445, 634 
connection, 463 
conservation laws, 79 
conservation of angular momentum, 
81, 470, 475 
as vector equation, 87 
elementary illustration of, 86
using to change direction, 87 
conservation of energy, 87, 89, 102, 
420 
in collisions, 94 
in general, 97 
in Principia, 141
conservation of energy equation, 92 
conservation of kinetic energy and of 
momentum, 113, 206 
conservation of momentum, 22, 32, 80, 
469 
first statement of, 273 
Newton’s statement of, 273 


conservation of momentum and con- 
servation of kinetic energy, 113, 206 
conservative force, 47, 90, 92 
Lagrange’s equations for, 475 
rigid body in field of, 194 
constant energy paths, 464 
Euler for, 464 
constraint, 205ff. 
d’Alembert’s principle for, 210 
differential, 230ff.
d’Alembert’s principle for, 233 
forces, finding with Lagrange multi- 
pliers, 238, 239 
holonomic, 230ff.
ideal, 210 
meaning of, 210 
problems, Lagrange’s equations with, 
445 
time-dependent, 227, 449 
contact, 110 
contact curves, 543 
contact point, velocity in rolling, 221, 
Zoe 
continuous body, 195 
continuous force, 13 
continuous string, 314 
continuum mechanics, 468 
contour integration, 636, 663 
contraction, 512 
convolution, 319 
coordinates 
angular, 623 
canonical, 571 
center of mass, 94 
cyclic, 445, 468 
elliptic, 559 
for three dimensions, 561 
ignorable, 445
laboratory, 115 
multiple-valued, 623 
spherical, 290 
“coordinate subspaces”, isotropic, 601
coordinate system 
non-inertial, 332 
rotating, 332 
symplectic, 604 
Cordani, 592 


Coriolis, 420 
Coriolis force, 377, 381, 396, 420 
amusing movie about, 381
on rotating platform, 381 
corrections, acceleration, 378 
cotangent bundle, 511 
invariant definition of 1-form θ on,
511
special features of, 511
2-form ω on, 512
Cotes’ spiral, 127 
Coulomb, 410 
friction, 411 
Coulomb-Morin friction, 411
couple, 201 
coupled oscillators, 302, 479 
coupling, 94, 98 
weak, 304 
Courant and Hilbert, 562 
covariant derivative, 482 
covariant, bilinear, 590 
coy physicists, 117 
critical point 
in calculus of variations, 461 
conditional, Lagrange multipliers for,
471
Euler’s Rule for, 474
critically damped, 297
cross-product, 55, 81, 186
analytically defined, 82
geometrically defined, 82 
crystal, 494 
bi-axial, 526 
crystal ball, viii
curvature, 62 
formula for, 374 
curve traced by axis of top in cuspidal 
case, perpendicularity of, 349 
cushions of billiard tables, height of, 
418 
cusp in envelope, 321 
cuspidal case, for top, 349ff., 356 
cuspidal polar top, 355 
cut angle, 423 
cyclic coordinates, 445, 468 
cycloid, 221, 236, 288, 327 
geometric proofs for, 289 


involute of, 327 
cycloidal pendulum, 288 


d’Alembert’s formula, 502, 505 
d’Alembert’s principle, 184ff. 
for constraints, 210 
for differential constraints, 233 
with Lagrange’s equations, 441
da Vinci, 409 
damped forced oscillations, 300 
underdamped case, 301 
damped oscillations, 295, 390 
damped, critically, 297 
second solution for, 297 
damping force, 295 
Darboux, 172, 604 
equation, 504 
de Broglie, 555, 557 
decay, exponential, 296 
decompose forces, 30 
decrease in weight approaching equa- 
tor, 380 
defined operationally, 8 
deflection 
of cannonball, 386 
of falling body, 382 
naive calculation for, 383 
of hanging body, 379 
southward, 380 
southward, 386, 405 
degeneracy, 636, 637 
Delaunay, 640 
variables, 640 
densities, spherically symmetrical, 137 
density, of continuous body, 195 
dependence, domain of, 502, 506, 507 
derivative 
body, 333 
covariant, 482 
fibre, 517 
fibre-wise, 517 
rotating observer’s, 333 
Desaguliers, 410, 411 
Descartes, 273, 275, 488 
Descartes’ Law, 488 
descending node, 565 
descent, method of, 506 


descent, time of, 289, 385 
developable surfaces, 519 
deviousness of Leibnizian notation, 122 
diagonalization of matrices, simultane- 
ous, 478 
diffeomorphism, 608 
differential constraints, 230ff. 
d’Alembert’s principle for, 233 
differential equations, stability of solu- 
tions for, 324 
diffraction, 500, 563 
Dijksterhuis, 6
direction of down, 381 
direction of force, 9 
disc, rolling, 230, 362 
along circle, 365, 454 
along straight line, 364, 454 
stability of, 364 
compared to rolling sphere, 230, 240 
down inclined plane, 235 
Lagrange’s equations for, 452 
non-upright, 231
discrete subgroup of Rⁿ, see subgroup
of Rⁿ, discrete
disembodied ghost, 637, 639 
displacements, virtual infinitesimal, 
178 ff. 
distinction between mass and weight, 
20 
distribution, not integrable, 232 
div, 503 
divergence, 138 
divergence theorem, 138, 503 
dn, elliptic function, 372 
domain of dependence, 502, 506, 507 
double image, 494 
double pendulum, 229, 307, 480 
tension on string for, 480 
downward direction, 381 
draw shot, billiards, 419 
driving force, 298 
circular frequency of, 298 
oscillating, 298 
duality and power force laws, 154 
Dublin, University of, 522 
Duck and Sudarshan, 640 
Dugas, 5, 466, 490, 499 


dynamical angle, 654 
“dynamics reduces to statics”, 184 


Eötvös, 38, 379
earth 
bulge near equator, 20-21, 405 
circumference of, 67-68
radius of, 16 
rotating, 376, 377 
earth-moon system, 25 
Earth Trojans, 401
Earth, asteroid companions of, 401 
eccentric anomaly, 163, 643 
eccentricity, 166 
eclipses of Jupiter’s satellites, 491 
ecliptic, plane of, 565 
effective mass, 248 
effective potential energy, 144 
eggs, hard boiled, 426, 432 
Ehrenfest, 646 
eikonal, 555 
Einstein, 17, 39, 108, 394, 640, 646 
letter to Mach, 394 
Einstein’s relativity principle, 275 
elastic
body, 491
completely, 94, 96
perfectly, 206, 249
limit, 292
elasticity, modulus of, 253 
electric charge, 24 
opposite, 24 
electric field, 138, 199 
electrical circuit, 293 
damping for, 295 
electrical examples, for harmonic mo- 
tion, 294 
electricity, 52 
electromagnetic waves, 501 
electromagnetism, 200 
ellipse
compared to rotating elliptical path, 202
focal point property of, 72 
ellipsoid, inertia, 189 
rolling, 337 


ellipsoid of revolution, 494 
elliptic 
coordinates, 559 
for three dimensions, 561 
function, 75, 372, 347 
functions cn, dn, sn, 372 
integral, 51, 73, 101 
elliptical 
harmonic motion, 71 
orbit, 16 
of planets, 37 
centered at focus, 124 
orbit, centered at origin, 123 
orbit, revolving, 131, 166 
under central force towards origin, 
71 
energy, 529 
conservation of, 87, 89, 92, 102, 420 
in collisions, 94 
in general, 97 
in Principia, 141 
in Lagrangian, 448 
kinetic, 87, 88 
factor of 1/2 in, 420 
loss of, 97 
of rigid body, 193 
rotational part, 194, 204 
for continuous body, 196, 204 
translational part, 194 
loss, Carnot, 98, 102 
of curve, 463 
paths of constant, 464 
Euler for, 464 
per unit mass, 121
potential, 87, 89, 90 
effective, 144 
in Principia, 142 
only determined up to a constant, 
89 
transfer, 305, 309 
not complete, 306 
engineering 
mechanical, 110, 206, 417 
problems, 257 
english, side, 421 
envelope, 288, 321, 491 
cusp in, 321 


of family of surfaces, 323 
of normals to an ellipse, 321 
of secondary waves, 491 
of solutions, 542 
“equal” solutions for Lagrangians, 459 
equal forces, 14 
equator 
bulging of earth near, 20-21, 405 
effect on plumb bob, 381 
decrease in weight approaching, 380 
equiangular spiral, 126 
equilibrium, 213ff.
and potential function, 214 
of rigid body, 175 
rigid, 176 
stable, 215, 476 
unstable, 477 
equilibrium of planes, 5 
equilibrium point, small oscillations 
about, 478 
equilibrium points for Lagrange equa- 
tions, 476 
equinox(es) 
precession of, 340, 361 
by d’Alembert, 361 
exercise in Goldstein [1], 361 
Newton’s calculations, 361 
spring, 361 
vernal, 565 
equipotential surfaces, 169 
equivalent forces, 201 
escape velocity, 117 
ether, particles of, 491, 563 
Euler, 257, 332, 381, 398, 410, 466, 
497, 559 
angles, 341ff. 
variants of, 341 
equation, 461, 462 
Boundary Term Corollary, 462 
equations, 318, 333, 334 
dynamic, 333, 334 
geometric, 344 
for rotating principal vectors, 362 
force, 377, 381 
precession, 339, 381 
Euler’s dynamic equations, 344 
Euler’s geometric equations, 344 


Euler’s Rule, 474 
Euler’s theorem for homogeneous qua- 
dratic functions, 448 
Euler-Bernoulli equation, 256 
Euler-Lagrange equations, 463 
Euler-Poisson-Darboux equation, 504 
evolute, 288 
evolution, 648 
exhaustion, method of, 4 
exponential decay, 296 
extended Hamilton’s principle, 534, 
583, 592 
paradoxical fact about, 535 
external conical refraction, 526 
external force, 29, 79 
extremal, in calculus of variations, 461
eye-glasses (spectacles), 488 


fall into center, 160
falling body, deflection of, 382 
naive calculation for, 383 
falling chain, folded, 103 
falling faster than freely falling 
chain, 105 
infinite speed, 106 
family of surfaces, envelope of, 323 
Fasano and Marmi, 561, 640 
fast precession, 353 
fast top, 349ff.
Fenchel, 514 
Fermat, 489 
Fermat’s principle, 489, 522, 524, 527 
Feynman, 26, 97ff., 216, 394, 411, 490 
fibre derivative, 517 
fibre-wise derivative, 517 
fibre-wise Legendre transform, 517, 
528 
fictitious forces, 376, 377 
fifth power, inverse, 70 
Fignewton, Dr., 43 
filament, 241 
wrapped around object, 245, 267 
Fine Hall, 108 
finite propagation speed, 502 
Finsler metric, 495 
first integral, 93 
first law, Newton’s, 10 


first order PDE, 540, 542, 668, 673, 
677 
linear, 668 
most general, 677 
quasi-linear, 673 
fixed masses, two, 559 
fixed stars, 11 
Flachensatz, 85 
flip over, gyroscope, 357, 358 
flow of Hamiltonian vector field, 575, 
576 
flux, 139 
focal point property of ellipse, 72 
focused image of lens, 488 
fogeys, old, 412 
folded chain, falling, 103 
foliation, 323 
follow shot, billiard, 419 
football, spinning, 432 
force(s), 8, 9 
1st power, 123
nth power, 122ff., 135
azimuthal, 377, 381 
central, towards origin, 71 
centrifugal, 38, 377, 378, 396 
centripetal, 365, 378 
collinear, 53 
composition of, 28, 30 
conservative, 90, 92, 475 
Lagrange’s equations for, 475 
rigid body in, 194 
constraint, 
finding with Lagrange multipliers, 
238 
finding with Lagrange multipliers, 
in holonomic case, 239 
Coriolis, 377, 381, 396, 420 
amusing movie about, 381
on rotating platform, 381 
damping, 295 
proportional to velocity, 295 
decompose, 30 
direction of, 9 
driving, 298 
oscillating, 298 
equal, 14 
equivalent, 201 


Euler, 377, 381 
external, 29, 79 
fictitious, 376, 377
frictional, 409 
generalized, 443 
gravitation, 14 
independent of velocity, 17 
normal component of, 225 
impulsive, 12, 57, 278, 279 
intermolecular, 216 
internal, 29, 176 
for pendulum (tension), 210 
for rigid body, work done by, 195 
non-unique for rigid body, 178, 
184 
work done by, for rigid body, 195 
inverse cube, 126 
circular orbits not stable for, 126 
inverse square, 58, 60, 123 
exactness of, 135 
force, moment of, 85 
in friction, 409 
translational, 377 
forced oscillation, 298 
damped, 300 
underdamped case, 301 
resolution of, 30 
fork, tuning, 295 
frequency of, 295 
standard, 295 
Foucault, 356 
Foucault’s gyroscope, 381 
Foucault’s pendulum, 387, 661, 664 
intuitive, geometric, explanation for, 
388 
path of, 389 
cusps in, 390 
loops in, 390 
popular explanation for, 387 
small oscillations of, 389 
Fourier series, 310, 315 
fractional calculus, 319 
free curve, 672 
free for initial condition, 690 
free for initial data, 688, 691 
free symmetric top, 339 
in body coordinates, 339 


in inertial coordinates, 340 
freely falling chain, falling faster than, 
105 
infinite speed, 106 
French, 405 
frequencies of action-angle variables, 
635 
frequencies of invariant torus, 621 
frequency, 287 
angular, 18, 42 
circular, 287 
of driving force, 298 
Fresnel, 501, 524, 526, 563
friction, 216, 352, 409 
Amontons-Coulomb, 411 
and rolling, 221 
angle of, 434 
coefficient of, 225 
Coulomb, 411 
Coulomb-Morin, 411 
in opposite direction of motion, 412 
increased, 411 
kinetic, 410 
coefficient of, 411 
laws of, 409 
normal component of force in, 409 
proportional to normal component 
of gravitational force, 225 
role in billiards, 417 
rolling, 411 
sliding, 410
sliding without, 216
static, 410, 411
torque coming from, 427
viscous, 411
frictional force, 409 
Frobenius integrability, 232 
fulcrum, 183 
functions, homogeneous quadratic, 447 
homogeneous quadratic, Euler’s the- 
orem on, 448 
fundamental Poisson brackets, 607, 609


Génot and Brogliato, 416 
Galilean invariance, 273 
Galilean relativity principle, 275 


Galileo, 10, 13, 14, 19, 87, 220, 224, 
275, 276, 279, 382, 383, 488 
Gamma function, 319 
Garwin, 251 
gauge transformation, 459 
Gauss, 384 
Gauss’ law, 139 
Gauss’ lemma, 525 
Gauss’ theorem, 138 
general relativity, 39, 394 
general relativity, Schwarzschild solu- 
tion in, 640 
generalized canonical transformation,
571, 593
generalized forces, 443 
generating function(s), 577, 583 
disembodied ghost of, 637, 638 
for Hamilton’s equations, 585 
in time-independent case, 586ff. 
type 1, 578, 583 
type 2, 580, 584, 617, 633, 652, 653 
type 3 and type 4, 588, 589, 597 
mixture of types, 589 
other types, 588 
geodesics, 39, 463, 524 
on Riemannian manifolds, 461 
standard form for equations, 482
geometric 
angle, 655 
definition of Legendre transform,
535
description of rigid body motion,
336ff.
proof of Kepler’s second law, New- 
ton’s, 56 
proof, Newton’s concerning gravity, 
137, 138 
proofs for cycloid, 289 
Geometrical optics, 500, 524, 557 
geometrically defined cross-product, 82 
Gergonne, 499 
ghost, disembodied, 637, 639 
Giaquinta and Hildebrandt, 594
Gilbert and Sullivan, 420 
gimbal, 355, 381 
direction of movement, 357 


gliders, 217 


God (supreme being) 
existence of, 464, 498 
Goldstein, 361, 567 
grad, 90, 482 
Gram-Schmidt orthogonalization,
600 
gravitation force, normal component 
of, 225 
gravity, 9
acceleration of, 14 
center of, 5, 81, 204 
force of, 14 
independent of velocity, 17 
law of, 37 
modern-day problem involving, 559 
third law for, 39 
uniform, 13, 14, 204 
Gray and Nickel, 431 
Green’s theorem, 138 
Grimaldi, 500, 501 
Gronwall’s inequality, 165 
Guicciardini, 282 
gyration, radius of, 213 
gyrocompass, 358, 359 
period of, 360 
gyroscope, 340, 355, 391 
gyroscope, Anschütz-, 391
gyroscope, flip over, 357, 358
gyroscope, Foucault’s, 381
gimbal, direction of movement, 357 
nutation of, 357 
opposite direction for spin, 357, 358 
precession of, 356 


Hadamard, 506
Hair, 361
Hale, 659
Hall, 405
Halley, 20, 67
Halphen (5th order equation of), 172
Hamel, 417
Hamilton, 279, 522, 528, 540
Hamilton’s
canonical equations, 540, 546, 547
characteristic function, 554
characteristic function, Hamilton-Jacobi equation for, 554


equations, using generating functions 
to simplify, 585 
principle, 463, 498, 527 
use of “principle of least action” 
for, 466 
extended, 534, 583, 592 
paradoxical seeming fact about, 
535 
Hamilton-Jacobi equation, 540, 546 
for Hamilton’s characteristic func- 
tion, 554 
in usual notation for first order 
PDE’s, 546 
reduced, 554 
Hamilton-Jacobi theory, 540 
Hamiltonian, 533 
examples of, 539 
flows, 572 
form of Noether’s theorem, 606 
local, 577 
mechanics, 487, 509, 532 
relation to quantum mechanics, 
487 
time-dependent, 590 
time-independent, 590 
vector field, 569, 608 
flow of, 575, 576 
Hand and Finch, 21, 361 
hanging body 
deflection of, 379 
southward, 380 
not possible to detect, 381 
hanging cable, 268-271 
Hannay, 654, 655 
angle, 653, 655, 659, 664 
hoop, 656 
hard boiled eggs, 426, 432 
harmonic motion, 287, 289, 293 
electrical examples for, 294 
elliptical, 71 
harmonic oscillator, 550, 623, 634, 637 
harmonics, 316 
Harriot, 488 
heavy top, see top, heavy 
height of cushions of billiard tables, 
418 
Hektor, 400 


helium nuclei, 148 
Herivel, 67 
Hero of Alexandria, 487 
herpolhode, 338 
convexity of, 338, 366ff.
repeating sections of, 338 
Herrmann and Schmälzle, 110
Herrmann and Seitz, 110 
hidden planet, 404 
Higgs potential, 166 
high shot, billiard, 418
hinges, 242 
and constraints, 227 
historical questions, 273 
Hogwarts, 576 
holonomic constraints, 230ff. 
homo sapiens, 220 
homogeneity of space, 469 
homogeneous, 116 
homogeneous canonical transforma- 
tion, 596 
homogeneous media, 494 
homogeneous quadratic functions, 447 
Euler’s theorem on, 448 
“hook” notation, 512 
Hooke, 67, 382, 383 
Hooke’s law, 47, 252, 292 
for plank or rod, 253 
Hugenholtz, 430 
hurricanes, 392 
Huygens, 22, 26, 112, 273, 275, 288, 
490, 494 
Huygens’ 
construction, 492, 495, 501, 502,
507, 524, 562
principle, 502, 507, 562 
secondary waves, 563 
hydrodynamics, 573 
hyperbola 
conjugate, 159 
hyperbolic equation, 316 
alternate normal form for, 316 
normal form for, 316 
hyperbolic, orbit, 126 
centered at focus, 124 
hyperbolic spiral, 126 
hypergeometric equation, 455 


Iceland spar, 494, 500, 526 
ideal constraint, 210 
ignorable coordinates, 445 
image, double, 494 
impact line, 423 
impact parameter, 147 
smallest value of, 150 
impulsive force, 12, 13, 57, 278, 279 
incidence, angle of, 249, 487 
incident ray, 522 
inclination, 565 
inclined plane, sliding down, 215 
increased friction, 41] 
independence of integrals, 615 
indeterminate, statically, 206, 252 
index, refractive, 497 
indicatrices, discontinuously changing,
495
indicatrix, 495 
inelastic, completely, 94, 96, 98, 102 
inequality, Gronwall’s, 165 
inertia ellipsoid, 189 
rolling, 337 
inertia tensor, 186ff, 189 
calculating, 189 
matrix of, 189 
of continuous body, 195 
inertia, 9, 11 
law of, 11 
moment of, 190, 191 
for continuous body, 196, 203 
principal axis of, 189 
principal moment of, 189 
product of, 190 
inertial coordinates, free symmetric top 
in, 340 
inertial system, 11 
existence of, 275 
infinite amplitude, 299 
infinitesimal displacements, virtual,
178ff.
infinitesimal, virtual work, 181 
influence, range of, 502, 506, 507 
inhomogeneous equation, 298 
inner product, 599
matrix of, 599
rank of, 599


Inquisition, 276 
instrument, optical, 524 
integrability theorem, Liouville, see Li- 
ouville integrability theorem 
integrability, Liouville, 614, 641
integral(s) 
complete, 540, 541 
contour, 663 
first, 93 
Jellett’s, 425 
independence of, 615 
integral equation, 288, 318 
integral for solutions of Lagrange’s 
equations, 468 
integral form of remainder for Taylor’s 
formula, 659 
integral invariant of Poincaré-Cartan,
574 
integral invariants, 572 
of Poincaré, 572 
relative, of Poincaré, 573 
integral of motion, 605 
integration, contour, 636 
interference experiment, Young’s, 500
interior product, 512 
intermolecular forces, 216 
internal conical refraction, 526 
internal forces, 29, 176 
for pendulum (tension), 210 
for rigid body, work done by, 195 
non-unique for rigid body, 178, 184 
work done by, for rigid body, 195 
invariable plane, 337 
invariance, Galilean, 273 
invariance of Lagrange’s equations, 
463, 532 
invariance of nature under orthogonal 
maps, 280 
invariant definitions, 532, 533 
invariant definition of 1-form θ on
cotangent bundle, 511 
invariant(s) 
adiabatic, 433 
integral, of Poincaré-Cartan, 574
Jellett, 424 
adiabatic, 641, 644 
integral, 572 


of Poincaré, 572 
relative, of Poincaré, 573 
universal, 593, 598 
invariant torus, 621, 623 
theorem, 621 
circular frequencies of, 621 
frequencies of, 621 
periods of, 621 
inverse cube force, 126 
circular orbits not stable for, 126 
Newton’s investigation of, 127-128 
inverse fifth power, 70 
inverse square force, 16, 58, 60, 67, 75, 
123, 552, 553, 635, 637 
exactness of, 135 
for spherical bodies, 137 
involute(s), 288, 327 
infinitely many, 327 
parallel curve, 327 
involution, functions in, 614 
Irish Academy, Transactions of the Royal,
527
Islamic scholars, 487 
isoperimetric problem, 657 
isotropic “coordinate subspaces”, 601
isotropic media, 494 
isotropic subspace, 601 
isotropy lemma, 615 


J, adiabatic invariance of, 651 
Jacobi, 532, 540 
metric, 467 
identity, 607, 610, 613
Jacobi’s
Theorem, 547, 549, 585 
and mechanics, 549 
version of principle of least action, 
466 
Jellett, 412 
Jellett’s 
constant, 424 
integral, 425 
rediscovery of, 430 
invariant, 424 
Jost, 640, 641 
Joule, 98 


Journal of Irreproducible Results, 43 
Jupiter, satellites of, 491 
Just So Stories, 527 


Konig, 466 
Karcher, 170 
Kater pendulum, 258 
kenyon.edu, 242
Kepler, 488 
problem, 162ff. 
Kepler’s 
equation, 163-164 
second law, 55 
Newton’s geometric proof, 56ff.
third law, 37, 67, 125, 162 
Khein and Nelson, 661, 663, 664 
kinetic energy, 87, 88 
factor of 1/2 in, 420 
of rigid body, 193 
loss of, 97 
rotational part, 194, 204 
for continuous body, 196, 204 
translational part, 194 
kinetic friction, 410 
coefficient of, 411 
Kipling, 527 
Kirchhoff, 501, 563 
Kirchhoff’s formula, 506 
Kittel, 388 
Klein, 646 
Klein, Painlevé-Klein problem, 412 
Kleppner and Kolenkow, 298 
Kozlov, 573 
Krim, 411


lab top, 345 
laboratory coordinates, 115 
laboratory scattering angle, 115 
Lacroix, 499 
Lagrange, 398, 466, 490, 499 
Lagrange multipliers, 233, 451
for conditional critical points, 471 
Euler’s Rule for, 474 
finding constraint forces with, 238 
in holonomic case, 239 
for Euler equation, 271
Lagrange points, 398 


periodic orbits near, 400 
stability of, 399, 401 
Lagrange’s equations, 441, 443, 528 
constraint problems with, 445 
d’Alembert’s Principle with, 441 
equal solutions for, 459 
equilibrium points for, 476 
for rolling disc, 452 
invariance of, 532 
same solutions for, 459 
solutions for, integral for, 468 
third law built into, 470 
Lagrangian, 346, 443 
arbitrary, 445 
for tension on pendulum string, 457 
preserving, 467 
regular, 482, 517, 533 
time-dependent, 449 
conservation of energy for, 450 
Lagrangian mechanics, 220, 226, 230, 
437, 439 
Laidler, 646 
lamina, 202 
Landau and Lifschitz, 319, 372, 539, 
655 
Laplace, 167, 279, 384, 466, 499 
Laplace transform, 319 
latitude, 170 
latus rectum, 61 
law of areas, 85 
law of gravity, 37 
law of inertia, 11 
law of the lever, 3, 30 
laws of friction, 409 
least action, principle of, 464, 490,
499, 522, 527, 592
additional hypothesis for, 464
Euler’s role in, 466
for Hamilton’s principle, 466
Jacobi’s version, 466
Lee, 593 
Legendre transform, 513, 518, 597 
classical notation for, 515, 519 
fibre-wise, 517, 528 
generalized, 515, 516 
geometric definition of, 513, 535 
original form, 514 


involutive, 514 
Lehrer, 599 
Leibniz, 275, 466 
Leibnizian notation, 75, 88, 121 
deviousness of, 122 
old-fashioned trick for, 75, 88, 100 
length, wave, 313 
lenses, 500 
image focused for, 488 
Leutwyler, 430 
level sets of V, 526 
lever, 3, 30, 183 
libration, 626 
libration points, 398 
periodic orbits near, 400 
stability of, 399, 401 
light, 487 
analogous to sound, 490 
mechanical analogies for, 487, 489, 
496 
resolving motion of, 487, 489 
speed of, comparison of in air and 
water, 391, 489, 49 
determined by Fizeau and
Breguet, 489, 496 
first determined by Foucault, 489, 
496 
wave theory of, 500, 524 
light ray, 492 
limit, elastic, 292 
line of nodes, 342, 565 
linear first order PDE, 668 
linear triatomic molecule, 480 
linearization, 324 
canonical form for, 324 
linearize Lagrange equations near an 
equilibrium point, 477 
link of chain, 102-105 
yanked, 102, 105 
Liouville, 536, 576, 614, 641 
Liouville’s volume theorem, 536, 538, 
576 
in thermodynamics, 538 
Liouville integrability, 614, 641 
Liouville integrability theorem, 616, 
620, 628, 640 


classical version, 616 
time-independent, 619, 620 
Liouville tori, 623 
Liouville-Arnold, 641 
Liouville-Arnold tori, 623 
liquid, 378 
rotating, 379 
as speedometer, 408 
Lissajous, 295 
Lissajous figure, 294, 620 
Lloyd, 526 
load, normal, 412 
Lobachevsky song, 599 
logarithmic spiral, 126 
longest axis, stable rotation about, 336 
longitude of ascending node, 565 
longitudinal waves, 501 
Lorentz force law, 52, 199 
low shot, billiard, 419 


Mécanique Analytique, 439
Mach, 5, 282, 393
Einstein letter to, 394
Mach’s principle, 393, 394
MacTutor History of Mathematics,
466
Magic Formula, Cartan’s, 576, 595
magnet, 9, 23
magnetic field, 52, 199
magnetism, 52
modern conception of, 24
Malus, 498
Malus’ theorem, 499, 523, 524
Mars Trojans, 401
Marsden and Hughes, 257
Marsden and Ratiu, 576, 596, 658,
659
mass, 8, 16, 19, 37
additivity of, 16, 35
center of, 80, 190
coordinates, 94
distinction from weight, 20
effective, 248
having the same, 28
measuring, 18
of planets, 162 
operational definition of, 15 
point, 3, 10, 175 
reduced, 116 
rest, 43 
test, 52 
variable, 35, 100 
matter, quantity of, 8, 16 
Maupertuis, 464, 496 
biography of, 466 
Maupertuis’ principle of least action, 
466, 522, 527, 592 
maximis and minimis, 489 
Maxwell, 167, 501 
McOwen, 503 
measuring mass, 18, 20 
mechanical analogies for light, 487, 
489, 496 
mechanical engineering, 110, 206 
mechanics and optics, 522, 555 
mechanics 
analytical, 439 
specialized considerations for, 440 
applied, 110, 206, 417 
characteristic function for, 527 
continuum, 468 
derived from symmetry, 470 
engineering, 110, 417 
Hamiltonian, 487, 509, 532 
relation to quantum mechanics, 
487 
Lagrangian, 220, 226, 230, 437, 439 
Newtonian, 7 
quantum, 487, 653 
relation to Hamiltonian mechan- 
ics, 487 
wave, 641 
Mehra and Rechenberg, 646
metaphysical, 464 
method of descent, 506 
method of exhaustion, 4 
Meyer, 410 
Miller and Ross, 319 
Mineur, 640, 641
mirages, 494 
mirrors, 500
Misner, Thorne, and Wheeler, 40
misspent youth, 420 
mode(s), normal, 303, 305, 311ff, 315, 
478 
normal, by cleverness, 311ff, 314 
in sync or anti-sync, 479 
modulus of elasticity, 253 
Moffatt and Shimomura, 433 
molecular adhesion, 410 
molecule, linear triatomic, 480 
moment(s), 3 
of force, 85 
of inertia, 190, 191 
for continuous body, 196, 203 
of momentum, 86, 200 
principal, of inertia, 189 
momentum, 9 
angular, 83 
conservation of, 470, 475 
of system, 83 
rotational, 84, 187 
change of, 12 
conjugate, 445, 634 
conservation of, 32, 80, 469 
as vector equation, 87 
first statement of, 273 
Newton’s statement of, 273 
conserved, 22 
law, 80, 186 
moment of, 86 
total, 22, 29 
Monge, 499 
Monge cone, 679 
monkey, 111 
moon, 16, 25, 67, 169 
always facing earth, 170 
orbit of, 131 
tidal forces on, 170 
Morin, 410 
Moser, 60 
motion 
absolute, 12 
friction in opposite direction of, 412 
integral of, 605 
quantity of, 9 
relative, 12 
resolving, for light, 487, 489 
rigid body, 332
geometric description of, 336ff
possible, 185
motion of light, resolving, 487, 489 
movement of gyroscope gimbal, 357 
moving charged particle, 52 
Muggles, 576 
multiple-valued coordinates, 623 
multiple-valued function 
multiplier, Lagrange, 233, 451
for Euler equation, 271


Nahin, 271 
Napoleon, 494 
Nature, 463, 489, 496, 497 
Navarro and Perez, 646 
negatively charged body, 559 
Neher and Leighton, 11 
Neptune Trojans, 401
Nestor, 400 
neutral plane, 254 
neutron, 24, 113 
Newton, 7, 382, 489, 495, 50 
and “absolute space”, 393 
Newton’s 
analysis of central forces, 55ff. 
cradle, 108, (also 491) 
first law, 10 
geometric proof concerning gravity, 
137, 138 
geometric proof of Kepler’s second 
law, 56ff
Laws, 3, 7 
proof of parallelogram law, 278 
second law, 12 
statement of parallelogram law, 278 
theorem of revolving orbits, 130 
third law, 21 
for gravity, 39 
Newton’s proof of, 276, 277 
strong form, 79, 84, 179, 199 
Three Laws, 10
Newtonian Mechanics, 7 
node, ascending, 565 
longitude of, 565 
node, descending, 565 
nodes, line of, 342, 565 
Noether’s theorem, 467, 610
Hamiltonian form of, 606 
non-homogeneous media, 494 
non-inertial coordinate system, 332,
376
non-integrable distribution, 232 
non-intuitive nature of third law, 23 
non-isotropic media, 494 
non-planar rigid body, 177ff 
non-upright rolling disc, 231 
nondegenerate, 512, 599 
nonsharp signals, 507, (also 502) 
normal component of force, in friction, 
409 
normal load, 412 
normal mode(s), 303, 305, 311ff, 315,
478
by cleverness, 311ff, 314
in sync or anti-sync, 479
normal slowness, vector of, 526
normal, unit, 139
inward or outward pointing, 139
notation, condensed, 529, 531 
nucleus, helium, 148 
nucleus, radius of, 150 
number, angular wave, 313 
number, wave, 313 
nutation of gyroscope, 357 
nutation of top, 349
rate of compared to rate of spinning,
351
size of compared to rate of spinning,
351
precession without, 352


objects falling, 276
ODE’s, second order, 532
Odysseus, 400
Olbers, 385
Olbers’ paradox, 385
old fogeys, 412
one-body problem, 120
one-dimensional problem, reduction to,
128, 144
Onnes, 391, 392, 664
operational definition, 8
of mass, 15
opposite electric charges, 24 
optical instrument, 524 
optical length, 523 
optics 
and mechanics, 522, 555 
characteristic function in, 524, 527 
geometrical, 500, 524, 557 
in antiquity, 487 
wave theory of, 557 
Optics of Euclid, 487 
Opticks of Newton, 495
Queries at end of, 496 
orbit 
of moon, 131 
centered at focus, conic section, 124 
circular, 67 
conic section, centered at focus, 124 
elliptical, 16 
centered at focus, 124 
centered at origin, 123 
general nature of, 128 
hyperbolic, 126 
centered at focus, 124 
parabolic, 73 
centered at focus, 124 
period of, 125 
depending on energy, 125 
depending on length of semimajor 
axis, 125 
repeating sections of, 129 
revolving, 129 
shape of, 75, 121 
stationary, 396 
with parameterization, 121
orbital plane, 565 
origin, reach, 160 
orthogonal map, invariance of nature 
under, 280 
orthogonal transformation, derivative
of, 181
orthogonalization, Gram-Schmidt,
600
oscillation(s), 287, 296 
damped, 295, 390 
forced, 298 
damped, 300
underdamped case, 301 
in anti-sync, 303 
in sync, 303 
quality of, 297 
small, 476 
about equilibrium point, 478 
Lagrangian mechanics treatment 
of, 476 
of a system, 478 
of spherical pendulum, 291 
sum of, 478 
oscillator, harmonic, see harmonic 
oscillator 
oscillators, connected by spring, 302 
oscillators, coupled, 302, 479 
oscilloscope, 293 
osculating circle, 62 
Osgood, 265 
Ostberg, 426 
Ostrogradsky’s theorem, 138 
overdamped, 296 


Painlevé paradoxes, 412, 414
Painlevé-Klein, 412 
Palais and Palais, 164, 326 
Panthéon, 387 
parabola, projectile’s path, 279 
parabolic orbit, 73
centered at focus, 124
parallel axis theorem, 191, 196 
parallel curves, for involutes, 327 
parallelogram construction, 28 
parallelogram law, 278
essential hypothesis for all proofs,
280ff
Newton’s statement and proof, 278
parameter, impact, 147
parameters, variation of, 298
Paris exhibition of 1851, 387
Panthéon, 387
Pars, 559, 594, 640 
particle, 10 
particles of ether, 491, 563 
Patroclus, 400 
Pauli, 427 
PDE
first order, 540, 542, 668, 677
linear, 668
first order quasi-linear, 673 
PDE Primer, 542, 667 
pendulum, 20, 22, 287, 625 
equation, 47, 209 
compound, 211 
configuration space for, 209 
coordinate system on, 209 
cycloidal, 288 
double, 229, 307, 480 
tension on string for, 480 
for gravitational acceleration, 68 
Foucault’s, 387, 661, 664 
intuitive, geometric, explanation 
for, 388 
path of, 389 
cusps in, 390 
loops in, 390 
popular explanation for, 387 
small oscillations of, 389 
interchanging potential and kinetic 
energy, 93 
Kater, 258 
period of, 50 
phase portrait of, 625, 626 
physical, 211 
rotating, 620 
spherical, 112, 290, 390, 479, 620 
phase portrait of, 627-628 
small oscillations of, 291 
swinging in plane, 112 
using Lagrangians, 457 
string, tension on, 210 
using Lagrangians, 457 
using constraints, 208 
Perez, 646 
perfectly elastic, 206, 249, 491 
perfectly rough, 225, 249 
periapsis, 128 
smallest, 174 
pericenter, 128 
perigee, 128 
perihelion, 128, 565 
argument of, 565
period, 287 
Period Lemma, 133, 151 
period of orbit, 125
depending on energy, 125 
depending on length of semimajor 
axis, 125 
period of pendulum, 50
periodic orbits, near Lagrange points, 
400, 401 
periodic, conditionally, 620 
perpendicular axis theorem, 202 
perpendicular, curve traced by axis of 
top, 349 
Persson, 381, 392 
perturbation, 132, 646 
phase, 287 
difference, 294 
portrait, 624 
shift, 302 
philosophical questions, 273 
Philosophical Transactions of the Royal 
Society, 527, 540 
physical pendulum, 211 
physicist’s way of looking at things, 378 
physics trick, 378 
piano, 316 
Picard, 68 
pin-hole camera, 487 
Pisces, 361 
Plague Years, 67 
Planck, 498 
Planck’s constant, 498, 555, 557
plane 
invariable, 337 
neutral, 254 
of the ecliptic, 565 
planet, 16 
elliptical orbits of, 37 
hidden, 404 
mass of, 162 
Planet-X, 404 
plank, supported, 254 
plank, thin, bending, 256 
platonic solid, 204 
Pliskin, 428 
plumb bob, 381 
deflection due to bulging of earth at 
equator, 381 
real versus theoretical

Poincaré, integral invariants of, 572, 
573 
Poincaré-Cartan, integral invariant of,
574
Poinsot, 336, 391
point mass, 3, 10, 175 
point transformation, 596 
Poisson, 279, 387, 466 
bracket, 605, 608 
invariant definition of, 606 
well-defined, 606 
fundamental, 607, 609 
Poisson’s method of spherical means, 
503 
Poisson’s theorem, 608, 610 
polar coordinates, area in, 56, 630, 631 
polar coordinates, length in, 160 
polar cuspidal top, 355 
polarization, 495, 500 
identity, 600 
polhode, 337, 366 
cone, 366 
polished surfaces, 411 
portrait, phase, 624 
positively charged body, 559 
potential energy, 87, 89, 90 
effective, 144 
in Principia, 142 
only determined up to a constant, 
89 
potential function, 270 
and equilibrium, 214 
potential, Higgs, 166 
Pourciau, 57, 60, 73, 279, 283 
power (nth) force, 122ff, 135
power force laws, duality, 154 
preëmptive strike, 8, 567
precessing tops, 352 
stability of, 353 
precession, Euler, 339, 381 
precession, true, 353 
precession, regular of top, 340 
precession of equinoxes, 340, 361 
by d’Alembert, 361
exercise in Goldstein [1], 361
Newton’s calculations, 361
precession of gyroscope, 356, 357
precession of top, 349
“fast”, 353
rate of compared to rate of spinning,
351
“slow”, 353
starting on its own, 352
without nutation, 352
preserving Lagrangian, 467 
Priamus, 400 
Princeton, 108 
principal axis, of inertia, 189
principal function, 527
partial derivatives of, 530
principal moment, of inertia, 189
principal vectors, rotating, Euler equations
for, 362
Principia, 7, 141, 490
conservation of energy in, 141
potential energy in, 142
product neighborhood lemma, 628 
product of inertia, 190 
product, interior, 512 
propagation speed, finite, 502 
proton, 24, 113 
pulley, 241, 245 
pure rotation, in rolling, 221, 222 


quadratic form, associated with symmetric
matrix, 568
quadratic functions, homogeneous, 44 
Euler’s theorem on, 448 
quadratures, 614, 634 
quality of oscillation, 297 
quantity 
of action, 497 
of matter, 8, 16 
of motion, 9 
quantization, royal road to, 640 
quantum mechanics, 487, 653
relation to Hamiltonian mechanics, 
487 
quantum theory, 501 
quark, 24 
quasi-linear first order PDE, 673 
Quetelet, 499


radially symmetric central force, 74, 88 
radius of earth, 16 
radius of gyration, 213 
radius of nucleus, 150 
range of influence, 502, 506, 507 
rank, of (skew)-inner product, 599
rate of nutation of top, compared to 
rate of spinning, 351 
rate of precession of top, compared to 
rate of spinning, 351 
rate of spinning, compared to rate of 
nutation, 351 
rate of spinning, compared to rate of
precession, 351
rate of spinning, compared to size of
nutation, 351
Rauch-Wojciechowski, Sköldstam, and
Glad, 432 
ray of light, 492 
ray, incident, 522 
ray, refracted, 522 
Rayleigh [Lord Rayleigh] (John 
William Strutt, Baron Rayleigh), 
644, 645, 646 
reach origin, 160 
reciprocal spiral, 126 
Reddingtus, 407 
reduced Hamilton-Jacobi equation, 554 
reduced mass, 116 
reduction to one-dimensional problem, 
128, 144 
reflection 
angle of, 249, 487 
law of, 487, 492, 493 
reflection and refraction, laws of, 500 
refracted ray, 522 
refraction 
conical, 526 
external, 526 
internal, 526 
Refractiones, Analysis ad, 490 
Refractiones, Synthesis ad, 490 
refractive index, 497, 522 
regular Lagrangian, 482, 517, 533 
regular precession of top, 340 
relative integral invariant of Poincaré,
573
relative motion, 12 
relativity principle, Einstein’s, 275
relativity principle, Galilean, 275 
relativity, general, 39, 394 
Schwarzschild solution in, 640 
relativity, special, 12, 17, 34, 175, 200, 
394, 487 
Religion and Natural Science, 498 
remainder, integral form for Taylor’s
formula, 659 
repeating sections of orbits, 129 
repeating sections, for herpolhode, 338 
resolution of forces, 30 
rest mass, 43 
restitution, coefficient of, 94 
restricted three-body problem, 396, 
559 
revolution, ellipsoid of, 494 
revolving elliptical orbit, 131, 166 
revolving orbits, 129 
Newton’s theorem of, 130 
Riccati, 270 
Richer, 20 
Riemannian manifolds, geodesics on,
461
Riemannian metric, 476, 482 
rigid body, 26, 81, 175ff, 491 
in equilibrium, 176 
bending of, 206 
collision of, 206 
in conservative force, 194 
in contact, 205ff. 
in equilibrium, 175 
kinetic energy of, 193 
motion, 332 
geometric description of, 336ff.
non-planar, 177ff.
possible motion of, 185 
rotation about an axis, 334 
work done by internal forces of, 195 
rigid equilibrium, 176 
rigid motion, 178 
rigid solution for, 185 
rigor, gone with the wind, 653 
ring, college, 426, 430 
rising top, 354 
rocket, 32
rocket launches, 117 
rocket science, 32 
Roemer, 491 
Roll, Krotkov, and Dicke, 40 
rolling, 220, 222, 418 
and frictional forces, 221 
body cone on space cone, 341 
inertia ellipsoid, 337 
not, 417, 422 
velocity of contact point, 221, 222 
what is going on, 223 
rolling condition, 222, 453 
rolling disc, 230, 362 
along circle, 365 
along straight line, 364 
stability of, 364
down inclined plane, 235
Lagrange’s equations for, 452
non-upright, 231
rolling friction, 411 
rolling sphere, 240 
rolling wheel, speed of, 226 
Roma caravans, 356 
Ronchi, 487, 488, 489, 496 
rope, 241 
wrapped around object, 245 
rotating 
axes, within body, 424 
bucket, 393 
coordinate system, 332 
distinguishing from non-rotating, 
383 
earth, 376, 377 
elliptical path, 292 
liquid, 379 
as speedometer, 408 
observer’s derivative, 333 
pendulum, 620 
principal vectors, Euler equations 
for, 362 
rotation 
about an axis, 334 
stable, 335 
unstable, 335 
about axis of top, 349 
about longest axis, stable, 336 
in phase portrait, 626
pure, in rolling, 221, 222 
rigid body about axis, 191, 193 
rotational 
angular momentum, 84, 187 
part of kinetic energy, 194, 204 
for continuous body, 196, 204 
symmetry, 470, 475 
rough, perfectly, 225, 249 
Routh, 430 
Routh integral, 430 
Royal Irish Academy, 522 
Royal Irish Academy, Transactions of, 527
royal road to quantization, 640 
Royal Society, Philosophical Transactions 
of, 527, 540 
rubber band, 295, 298 
Rutherford, 126 
experiments of, 148 
scattering formula, 148
Rutherford scattering, 147


Sabra, 490
saddle point, 325
sagitta, 59
Sahl, 488
“same” solutions for Lagrangians, 459
Santos-Benito and Gras-Mati, 258
scale, 14, 46
scattering angle, 114, 147
laboratory, 115
scattering formula, Rutherford, 147,
148
Schlebusch, 405
Schrödinger, 555
Schrödinger wave equation, 509, 555,
558
Hamilton-Jacobi theory and, 555
Schulz-Dubois, 391
Schwarzschild, 640
Schwarzschild solution in general relativity,
640
seat-of-the-pants approximation for
Tippe Top, 428
second law, Kepler’s, 55
second law, Newton’s, 12
second order ODE’s, 532
secondary wave, 491, 563
self-braking, 414 
semi-period, 297 
semimajor axis, 166 
separation of variables, 75, 314, 540, 
636 
separatrix, 626 
shape of orbit, 75, 121 
Shapiro, 392 
sharp signals, 506 
shattering experience, 104 
shift 
lagging, 301 
leading, 301 
phase, 302 
side english, 421 
Siegel and Moser, 594 
signals, 
nonsharp, 507 (also 502) 
sharp, 506 
simple harmonic motion, 287, 289, 
293 
electrical examples for, 294 
simultaneous diagonalization of matri- 
ces, 478 
sine law (Snell’s law), 488, 522 
singular solution, 543 
size of nutation of top, compared to
rate of spinning, 351
skew-adjoint transformations, 181
skew-inner product, 599
matrix of, 599
rank of, 599
skew-orthogonal, 601
skew-symmetric (alternating) map associated
with skew-symmetric matrix,
568
sleeping top, 353 
stability of, 354 
sliding, 215 
down inclined plane, 215 
down wedge, 217 
configuration space for, 217 
without friction, 216 
sliding and spinning, 417 
sliding friction, 410 
slow precession, 353 
slowness, vector of normal, 526
small angle, pendulum equation, 50 
small oscillations, 476 
approximations to, 292 
about equilibrium point, 478 
Lagrangian mechanics treatment, 476 
of a system, 478 
of spherical pendulum, 291 
sum of, 478 
smallest value of impact parameter, 150 
Smith, 319 
sn, elliptic function, 372 
snakelike, 338 
Snel, Willebrord van Royen, 488
Snell, 488 
Snell’s law, 488 
Newton’s argument for, 496 
soft body, 94 
SOHO (Solar and Heliospheric 
Observatory), 404 
solenoid, 52 
solid body, bending of, 206 
solutions for Lagrange’s equations, integral
for, 468
Sommerfeld, 35, 98, 102, 287, 307, 
636, 640, 646 
sound, analogous to light, 490 
southward deflection 
of falling body, 386, 405 
of hanging body, 380 
not possible to detect, 381 
space cone, 341 
body cone rolling on, 341 
space, absolute, 12, 275 
space, homogeneity of, 469 
spar, see Iceland spar
special relativity, 34, 175, 200, 394, 
487 
specific heat, 98 
spectacles (eye-glasses), 488 
speed of light, 491 
in air and water, 391, 489, 496 
determined by Fizeau
and Breguet, 496 
determined by Foucault, 496 
speed, finite propagation, 502 
speedometer, from rotating liquid, 408 
sphere, rolling, 240
spherical bodies, attractive force of, 137 
spherical coordinates, 290 
spherical means, Poisson’s method of, 
503 
spherical pendulum, see pendulum, 
spherical 
spin, 236 
opposite direction for gyroscope, 
257, 358 
spinning and sliding, 417 
spinning ball, 249 
spinning football, 432 
spiral 
Cotes’, 127 
equiangular, 126 
hyperbolic, 126 
logarithmic, 126 
reciprocal, 126 
Spivak, 101 
spring, 9, 14, 15, 47 
connecting oscillators, 302 
constant, 47, 292 
equinox, 361 
stiffness, 316 
stretching, 292 
sprocket wheel, chain and, 246 
square, inverse force, 16, 58, 60, 67, 75 
exactness of, 135 
stability, 213ff 
for solutions of differential equations, 
324, 326 
stable equilibrium, 215, 476 
stable rotation about an axis, 335, 336 
standing wave, 317 
static friction, 410, 411 
statically indeterminate, 206, 252 
“statics, dynamics reduces to”, 184 
stationary orbit, 396 
steady state, 301 
Stefan-Boltzmann law, 646 
Steiner’s theorem, 191
stiffness of spring, 316
straight down, 381 
strain, 253 
stress, 253 
stretching, spring, 292 
stretching, wire, 292
strike, preëmptive, 8, 567
string 
continuous, 314 
vibrating, 310ff, 557 
continuous and discrete cases,
310ff
strip condition, 683 
strip manifold condition, 691 
strong form of Newton’s third law, 25, 
79, 84, 179, 199, 470 
subgroup of R^n, discrete, 622, 641,
642 
subspace, isotropic, 601 
Superball, bouncing, 249 
SuperBall® (Wham-O®), 119 
superconductivity, 391 
surfaces, envelope of family of, 323 
surfaces, polished, 411 
suspension bridge, 269 
suspension, Cardan, 356 
swivel top, 345 
symmetrical top, heavy, 345ff. 
symmetry, 25, 200, 469 
rotational, 470, 475 
used to derive laws of mechanics, 
470 
symplectic 
area-preserving in 1 dimension, 630 
as canonical, 603 
basis, 601 
coordinate system, 604 
diffeomorphism, 608 
manifold, 599, 604 
map, 608 
transformation, 601 
vector space, 599 
sync, 479 
sync, oscillations in, 303 
Synge and Griffith, 257, 363, 372 
Synthesis ad Refractiones, 490 


Tantalus, 111
taut, 242
tautochrone problem, 318
tautochronous, 288
Taylor’s theorem, 314
integral form of remainder, 659
teleological principles, 463, 490, 498 
telescopes, 488 
temperature change of Carnot energy
loss, 98
tension, 243, 267, 308, 310
on pendulum string, using
Lagrangians, 457
on spherical pendulum string, using
Lagrangians, 457
on string for double pendulum, 480
Terrall, 466 
test mass, 52 
thermodynamics, 98, 538, 644 
thin plank bending, 256 
third law
built into Lagrange’s equations, 470
for gravity, 39
for rockets, 32
Kepler’s, 37, 67, 125, 162
Newton’s, 21
proof of, 276, 277
non-intuitive nature, 23
reason for in gravitation, 25
strong form of, 25, 79, 84, 179, 199,
470
Thompson, 53 
three-body problem, 396, 532, 644
restricted, 396, 559
tidal forces, 168
on moon, 170
tides, 25, 168 
time, as another variable, 227 
time-dependent
constraints, 227, 449
Hamiltonians, 590
Lagrangians, 449
conservation of energy for, 450
time-independent Hamiltonians, 590 
time of descent, 289, 385 
Tippe Top, 426 
Tobin, 391, 392, 496 
top, see, in particular, free symmetric
top
top, heavy, 339, 445
conditionally periodic motion of, 621
cuspidal case, 349ff., 356
fast, 349ff.
symmetrical, 345ff 
phase portrait for, 628 
lab, 345 
nutation, 349 
rate of compared to rate of spin- 
ning, 351 
size of compared to rate of spin-
ning, 351
perpendicularity of curve traced by 
axis in cuspidal case, 349 
polar cuspidal, 355 
precessing, 352 
stability of, 353 
without nutation, 352 
precession, 349 
rate of precession, compared to rate
of spinning, 351
regular precession of, 340 
rising, 354 
rotation about axis, 349 
sleeping, 353 
stability of, 354 
start precessing on its own, 352 
swivel, 345 
symmetric, 339 
free, 339 
in body coordinates, 339 
in inertial coordinates, 340 
top, toy, 345 
top, with spherical bottom, 431 
Top, Tippe, 426
tori 
Arnold, 623 
invariant, 621, 623 
Liouville, 623 
Liouville-Arnold, 623 
periodic vs. dense trajectories on, 
627, 636 
torque, 84, 85 
torque 0, for central force, 85 
torque coming from friction, 427 
torsion balance, 39 
torus, invariant, see invariant torus
total momentum, 22, 29 
toy top, 345 
Traité de lumière, 490
Transactions of the Royal Irish Academy,
527
transfer of energy, 305, 309 
not complete, 306 
transform, Legendre, see Legendre 
transform 
transient, 301 
translational force, 377 
translational part of kinetic energy, 194 
transversal waves, 501 
Treatise on Light, 490 
Trefethen, 392
triatomic molecule, linear, 480 
tribology, 409 
trick 
old-fashioned, for Leibnizian nota- 
tion, 75, 88, 100 
physics, 378 
Trojan asteroids, 396
Trojan war, 400
Trojans, Mars, 401
Trojans, Neptune, 401
true anomaly, 163, 643 
true precession, 353 
Truesdell, 200 
tube, 574 
tubs, bath-, 392 
tuning fork, 295 
frequency of, 295 
standard, 298 
two-body problem, 120, 136, 162 
two fixed masses, 559 
type 1 generating function, 578
type 2 generating function, 580, 617,
633, 652 
type 3 generating function, 588 
type 4 generating function, 588 


underdamped, 296 
UNESCO Courier, 356 
uniform gravity, 13, 14, 204 
unit normal, 139 
inward or outward pointing, 139 
universal invariants, 593, 598 
University of Dublin, 522 
unstable equilibrium, 477 
unstable rotation about an axis, 335
upright rolling disc, 230 
compared to rolling sphere, 230 
down inclined plane, 235 
upwards, shooting cannonball, 386 


van der Waerden, 6 
variable mass, 35, 100 
variables, separation of, 75, 314, 540, 
636 
variation (of a function), 461 
variation of parameters, 298 
variation vector field, 467 
variational principle, 487, 461 
vector(s), 9 
cross-product of, 55 
vector of normal slowness, 526 
vector space structure, 28 
velocity of contact point, in rolling,
221, 222
velocity 
vector, 9 
angular, 83 
escape, 117 
vernal equinox, 565 
vibrating string, 310ff, 557
continuous and discrete cases, 310ff.
virtual infinitesimal displacements,
178ff.
virtual work, principle of, 181 
viscous friction, 411 
vision, 487, 488 
volume theorem, Liouville’s, 536, 538, 
576 
in thermodynamics, 538
Vorlesungen über Dynamik, 466, 549


Wallis, 22, 273 
water-wheels, 381 
water waves, 507 
wave, 317, 555, 490 
electromagnetic, 501 
equation 
1-dimensional, 316 
Schrödinger, 509, 555
Hamilton-Jacobi theory and, 
555 
front, 524, 526, 562
length, 313 
longitudinal, 501 
mechanics, 641 
number, 313 
angular, 313 
secondary, 491 
standing, 317 
theory of light, 500, 524, 557 
transversal, 501 
water, 507 
weak coupling, 304 
wedge, sliding down, 217 
configuration space for, 218 
wedged, 414 
weight, 19, 37 
decrease approaching equator, 380 
depending on distance from center 
of earth, 20 
distinction from mass, 20 
Weinstein, 599, 604 
Wells and Siklos, 648 
Wham-O® SuperBall®, 119
wheel(s) 
paradoxes of, 220 
Galileo, 220, 224 
rolling, speed of, 226
sprocket, chain and, 246
water-, 381
whip, 100, 106 
Whittaker, 572, 590, 640 
Wikipedia, 119 
Wintner, 283 
WMAP (Wilkinson Microwave 
Anisotropy Probe), 404 
wobble, Chandler, 340 
Wolf, 400 
Wong and Yasui, 107 
work, 91, 181 
done by internal forces of rigid body, 
195 
modern meaning of, 420 
virtual infinitesimal, 181 
virtual, principle of, 181 
wrapped around object, 245, 267 
Wren, 22, 67, 273 
wrench, 201 
yanked link of chain, 102, 105 
Yogi Berra, 461 
Young, 500, 524 
Young’s inequality, 520 
Young’s interference experiment, 500 
youth, misspent, 420 


